Package org.apache.iceberg.spark.source
Class IcebergSource
java.lang.Object
org.apache.iceberg.spark.source.IcebergSource
- All Implemented Interfaces:
org.apache.spark.sql.connector.catalog.SupportsCatalogOptions
,org.apache.spark.sql.connector.catalog.TableProvider
,org.apache.spark.sql.sources.DataSourceRegister
public class IcebergSource
extends Object
implements org.apache.spark.sql.sources.DataSourceRegister, org.apache.spark.sql.connector.catalog.SupportsCatalogOptions
The IcebergSource loads/writes tables with format "iceberg". It can load paths and tables.
How paths/tables are loaded when using spark.read().format("iceberg").load(table)
table = "file:///path/to/table" -> loads a HadoopTable at given path table = "tablename" -> loads currentCatalog.currentNamespace.tablename table = "catalog.tablename" -> load "tablename" from the specified catalog. table = "namespace.tablename" -> load "namespace.tablename" from current catalog table = "catalog.namespace.tablename" -> "namespace.tablename" from the specified catalog. table = "namespace1.namespace2.tablename" -> load "namespace1.namespace2.tablename" from current catalog
The above list is in order of priority. For example: a matching catalog will take priority over any namespace resolution.
-
Constructor Summary
-
Method Summary
Modifier and TypeMethodDescriptionextractCatalog
(org.apache.spark.sql.util.CaseInsensitiveStringMap options) org.apache.spark.sql.connector.catalog.Identifier
extractIdentifier
(org.apache.spark.sql.util.CaseInsensitiveStringMap options) extractTimeTravelTimestamp
(org.apache.spark.sql.util.CaseInsensitiveStringMap options) extractTimeTravelVersion
(org.apache.spark.sql.util.CaseInsensitiveStringMap options) org.apache.spark.sql.connector.catalog.Table
getTable
(org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning, Map<String, String> options) org.apache.spark.sql.connector.expressions.Transform[]
inferPartitioning
(org.apache.spark.sql.util.CaseInsensitiveStringMap options) org.apache.spark.sql.types.StructType
inferSchema
(org.apache.spark.sql.util.CaseInsensitiveStringMap options) boolean
-
Constructor Details
-
IcebergSource
public IcebergSource()
-
-
Method Details
-
shortName
- Specified by:
shortName
in interfaceorg.apache.spark.sql.sources.DataSourceRegister
-
inferSchema
public org.apache.spark.sql.types.StructType inferSchema(org.apache.spark.sql.util.CaseInsensitiveStringMap options) - Specified by:
inferSchema
in interfaceorg.apache.spark.sql.connector.catalog.TableProvider
-
inferPartitioning
public org.apache.spark.sql.connector.expressions.Transform[] inferPartitioning(org.apache.spark.sql.util.CaseInsensitiveStringMap options) - Specified by:
inferPartitioning
in interfaceorg.apache.spark.sql.connector.catalog.TableProvider
-
supportsExternalMetadata
public boolean supportsExternalMetadata()- Specified by:
supportsExternalMetadata
in interfaceorg.apache.spark.sql.connector.catalog.TableProvider
-
getTable
public org.apache.spark.sql.connector.catalog.Table getTable(org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning, Map<String, String> options) - Specified by:
getTable
in interfaceorg.apache.spark.sql.connector.catalog.TableProvider
-
extractIdentifier
public org.apache.spark.sql.connector.catalog.Identifier extractIdentifier(org.apache.spark.sql.util.CaseInsensitiveStringMap options) - Specified by:
extractIdentifier
in interfaceorg.apache.spark.sql.connector.catalog.SupportsCatalogOptions
-
extractCatalog
- Specified by:
extractCatalog
in interfaceorg.apache.spark.sql.connector.catalog.SupportsCatalogOptions
-
extractTimeTravelVersion
public Optional<String> extractTimeTravelVersion(org.apache.spark.sql.util.CaseInsensitiveStringMap options) - Specified by:
extractTimeTravelVersion
in interfaceorg.apache.spark.sql.connector.catalog.SupportsCatalogOptions
-
extractTimeTravelTimestamp
public Optional<String> extractTimeTravelTimestamp(org.apache.spark.sql.util.CaseInsensitiveStringMap options) - Specified by:
extractTimeTravelTimestamp
in interfaceorg.apache.spark.sql.connector.catalog.SupportsCatalogOptions
-