Class IcebergSource

  • All Implemented Interfaces:
    org.apache.spark.sql.connector.catalog.SupportsCatalogOptions, org.apache.spark.sql.connector.catalog.TableProvider, org.apache.spark.sql.sources.DataSourceRegister

    public class IcebergSource
    extends java.lang.Object
    implements org.apache.spark.sql.sources.DataSourceRegister, org.apache.spark.sql.connector.catalog.SupportsCatalogOptions
    The IcebergSource loads/writes tables with format "iceberg". It can load paths and tables.

    How paths/tables are loaded when using spark.read().format("iceberg").load(table)

    table = "file:///path/to/table" -> loads a HadoopTable at given path table = "tablename" -> loads currentCatalog.currentNamespace.tablename table = "catalog.tablename" -> load "tablename" from the specified catalog. table = "namespace.tablename" -> load "namespace.tablename" from current catalog table = "catalog.namespace.tablename" -> "namespace.tablename" from the specified catalog. table = "namespace1.namespace2.tablename" -> load "namespace1.namespace2.tablename" from current catalog

    The above list is in order of priority. For example: a matching catalog will take priority over any namespace resolution.

    • Constructor Summary

      Constructors 
      Constructor Description
      IcebergSource()  
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      java.lang.String extractCatalog​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)  
      org.apache.spark.sql.connector.catalog.Identifier extractIdentifier​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)  
      java.util.Optional<java.lang.String> extractTimeTravelTimestamp​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)  
      java.util.Optional<java.lang.String> extractTimeTravelVersion​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)  
      org.apache.spark.sql.connector.catalog.Table getTable​(org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning, java.util.Map<java.lang.String,​java.lang.String> options)  
      org.apache.spark.sql.connector.expressions.Transform[] inferPartitioning​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)  
      org.apache.spark.sql.types.StructType inferSchema​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)  
      java.lang.String shortName()  
      boolean supportsExternalMetadata()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • IcebergSource

        public IcebergSource()
    • Method Detail

      • shortName

        public java.lang.String shortName()
        Specified by:
        shortName in interface org.apache.spark.sql.sources.DataSourceRegister
      • inferSchema

        public org.apache.spark.sql.types.StructType inferSchema​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
        Specified by:
        inferSchema in interface org.apache.spark.sql.connector.catalog.TableProvider
      • inferPartitioning

        public org.apache.spark.sql.connector.expressions.Transform[] inferPartitioning​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
        Specified by:
        inferPartitioning in interface org.apache.spark.sql.connector.catalog.TableProvider
      • supportsExternalMetadata

        public boolean supportsExternalMetadata()
        Specified by:
        supportsExternalMetadata in interface org.apache.spark.sql.connector.catalog.TableProvider
      • getTable

        public org.apache.spark.sql.connector.catalog.Table getTable​(org.apache.spark.sql.types.StructType schema,
                                                                     org.apache.spark.sql.connector.expressions.Transform[] partitioning,
                                                                     java.util.Map<java.lang.String,​java.lang.String> options)
        Specified by:
        getTable in interface org.apache.spark.sql.connector.catalog.TableProvider
      • extractIdentifier

        public org.apache.spark.sql.connector.catalog.Identifier extractIdentifier​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
        Specified by:
        extractIdentifier in interface org.apache.spark.sql.connector.catalog.SupportsCatalogOptions
      • extractCatalog

        public java.lang.String extractCatalog​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
        Specified by:
        extractCatalog in interface org.apache.spark.sql.connector.catalog.SupportsCatalogOptions
      • extractTimeTravelVersion

        public java.util.Optional<java.lang.String> extractTimeTravelVersion​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
        Specified by:
        extractTimeTravelVersion in interface org.apache.spark.sql.connector.catalog.SupportsCatalogOptions
      • extractTimeTravelTimestamp

        public java.util.Optional<java.lang.String> extractTimeTravelTimestamp​(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
        Specified by:
        extractTimeTravelTimestamp in interface org.apache.spark.sql.connector.catalog.SupportsCatalogOptions