org.apache.iceberg.spark.source.IcebergSource

All Implemented Interfaces:: org.apache.spark.sql.connector.catalog.SessionConfigSupport, org.apache.spark.sql.connector.catalog.SupportsCatalogOptions, org.apache.spark.sql.connector.catalog.TableProvider, org.apache.spark.sql.sources.DataSourceRegister

public class IcebergSource extends Object implements org.apache.spark.sql.sources.DataSourceRegister, org.apache.spark.sql.connector.catalog.SupportsCatalogOptions, org.apache.spark.sql.connector.catalog.SessionConfigSupport

The IcebergSource loads/writes tables with format "iceberg". It can load paths and tables.

How paths/tables are loaded when using spark.read().format("iceberg").load(table)

table = "file:///path/to/table" -> loads a HadoopTable at given path table = "tablename" -> loads currentCatalog.currentNamespace.tablename table = "catalog.tablename" -> load "tablename" from the specified catalog. table = "namespace.tablename" -> load "namespace.tablename" from current catalog table = "catalog.namespace.tablename" -> "namespace.tablename" from the specified catalog. table = "namespace1.namespace2.tablename" -> load "namespace1.namespace2.tablename" from current catalog

The above list is in order of priority. For example: a matching catalog will take priority over any namespace resolution.

Constructor Summary

Constructors

Constructor

Description

IcebergSource()
Method Summary

Modifier and Type

Method

Description

String

extractCatalog(org.apache.spark.sql.util.CaseInsensitiveStringMap options)

org.apache.spark.sql.connector.catalog.Identifier

extractIdentifier(org.apache.spark.sql.util.CaseInsensitiveStringMap options)

Optional<String>

extractTimeTravelTimestamp(org.apache.spark.sql.util.CaseInsensitiveStringMap options)

Optional<String>

extractTimeTravelVersion(org.apache.spark.sql.util.CaseInsensitiveStringMap options)

org.apache.spark.sql.connector.catalog.Table

getTable(org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning, Map<String,String> options)

org.apache.spark.sql.connector.expressions.Transform[]

inferPartitioning(org.apache.spark.sql.util.CaseInsensitiveStringMap options)

org.apache.spark.sql.types.StructType

inferSchema(org.apache.spark.sql.util.CaseInsensitiveStringMap options)

String

keyPrefix()

String

shortName()

boolean

supportsExternalMetadata()

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- IcebergSource
  
  public IcebergSource()
Method Details
- shortName
  
  public String shortName()
  
  Specified by:
  
  shortName in interface org.apache.spark.sql.sources.DataSourceRegister
- keyPrefix
  
  public String keyPrefix()
  
  Specified by:
  
  keyPrefix in interface org.apache.spark.sql.connector.catalog.SessionConfigSupport
- inferSchema
  
  public org.apache.spark.sql.types.StructType inferSchema(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
  
  Specified by:
  
  inferSchema in interface org.apache.spark.sql.connector.catalog.TableProvider
- inferPartitioning
  
  public org.apache.spark.sql.connector.expressions.Transform[] inferPartitioning(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
  
  Specified by:
  
  inferPartitioning in interface org.apache.spark.sql.connector.catalog.TableProvider
- supportsExternalMetadata
  
  public boolean supportsExternalMetadata()
  
  Specified by:
  
  supportsExternalMetadata in interface org.apache.spark.sql.connector.catalog.TableProvider
- getTable
  
  public org.apache.spark.sql.connector.catalog.Table getTable(org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning, Map<String,String> options)
  
  Specified by:
  
  getTable in interface org.apache.spark.sql.connector.catalog.TableProvider
- extractIdentifier
  
  public org.apache.spark.sql.connector.catalog.Identifier extractIdentifier(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
  
  Specified by:
  
  extractIdentifier in interface org.apache.spark.sql.connector.catalog.SupportsCatalogOptions
- extractCatalog
  
  public String extractCatalog(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
  
  Specified by:
  
  extractCatalog in interface org.apache.spark.sql.connector.catalog.SupportsCatalogOptions
- extractTimeTravelVersion
  
  public Optional<String> extractTimeTravelVersion(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
  
  Specified by:
  
  extractTimeTravelVersion in interface org.apache.spark.sql.connector.catalog.SupportsCatalogOptions
- extractTimeTravelTimestamp
  
  public Optional<String> extractTimeTravelTimestamp(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
  
  Specified by:
  
  extractTimeTravelTimestamp in interface org.apache.spark.sql.connector.catalog.SupportsCatalogOptions

Class IcebergSource

Constructor Summary

Method Summary

Methods inherited from class java.lang.Object

Constructor Details

IcebergSource

Method Details

shortName

keyPrefix

inferSchema

inferPartitioning

supportsExternalMetadata

getTable

extractIdentifier

extractCatalog

extractTimeTravelVersion

extractTimeTravelTimestamp