Class SparkCatalog

java.lang.Object
org.apache.iceberg.spark.SparkCatalog
All Implemented Interfaces:
HasIcebergCatalog, SupportsReplaceView, org.apache.spark.sql.connector.catalog.CatalogPlugin, org.apache.spark.sql.connector.catalog.FunctionCatalog, org.apache.spark.sql.connector.catalog.StagingTableCatalog, org.apache.spark.sql.connector.catalog.SupportsNamespaces, org.apache.spark.sql.connector.catalog.TableCatalog, org.apache.spark.sql.connector.catalog.ViewCatalog, ProcedureCatalog

public class SparkCatalog extends Object implements org.apache.spark.sql.connector.catalog.ViewCatalog, SupportsReplaceView
A Spark TableCatalog implementation that wraps an Iceberg Catalog.

This supports the following catalog configuration options:

  • type - catalog type: "hive", "hadoop", or "rest". To use a different catalog implementation, set the catalog-impl option instead.
  • uri - the Hive Metastore URI for Hive catalog or REST URI for REST catalog
  • warehouse - the warehouse path (Hadoop catalog only)
  • catalog-impl - a custom Catalog implementation to use
  • io-impl - a custom FileIO implementation to use
  • metrics-reporter-impl - a custom MetricsReporter implementation to use
  • default-namespace - a namespace to use as the default
  • cache-enabled - whether to enable catalog cache
  • cache.case-sensitive - whether the catalog cache should compare table identifiers in a case-sensitive way
  • cache.expiration-interval-ms - interval in millis before expiring tables from catalog cache. Refer to CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS for further details and significant values.
  • table-default.$tablePropertyKey - table property $tablePropertyKey default at catalog level
  • table-override.$tablePropertyKey - table property $tablePropertyKey enforced at catalog level
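As a sketch of how these options are typically supplied (the catalog name my_catalog and all property values below are placeholders, not defaults), the catalog is registered and configured through Spark properties:

```properties
# Register an Iceberg catalog named "my_catalog" backed by a Hive Metastore.
spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.my_catalog.type=hive
spark.sql.catalog.my_catalog.uri=thrift://metastore-host:9083

# Optional cache tuning.
spark.sql.catalog.my_catalog.cache-enabled=true
spark.sql.catalog.my_catalog.cache.expiration-interval-ms=60000

# Apply a table property default at the catalog level.
spark.sql.catalog.my_catalog.table-default.write.format.default=parquet
```

Each `spark.sql.catalog.my_catalog.*` suffix maps to one of the option keys listed above and is passed to initialize(name, options) as a CaseInsensitiveStringMap.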

  • Field Summary

    Fields inherited from interface org.apache.spark.sql.connector.catalog.SupportsNamespaces

    PROP_COMMENT, PROP_LOCATION, PROP_OWNER

    Fields inherited from interface org.apache.spark.sql.connector.catalog.TableCatalog

    OPTION_PREFIX, PROP_COMMENT, PROP_EXTERNAL, PROP_IS_MANAGED_LOCATION, PROP_LOCATION, PROP_OWNER, PROP_PROVIDER

    Fields inherited from interface org.apache.spark.sql.connector.catalog.ViewCatalog

    PROP_COMMENT, PROP_CREATE_ENGINE_VERSION, PROP_ENGINE_VERSION, PROP_OWNER, RESERVED_PROPERTIES
  • Constructor Summary

    Constructors
    Constructor
    Description
    SparkCatalog()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    alterNamespace(String[] namespace, org.apache.spark.sql.connector.catalog.NamespaceChange... changes)
     
    org.apache.spark.sql.connector.catalog.Table
    alterTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.TableChange... changes)
     
    org.apache.spark.sql.connector.catalog.View
    alterView(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.ViewChange... changes)
     
    protected Catalog
    buildIcebergCatalog(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
    Build an Iceberg Catalog to be used by this Spark catalog adapter.
    protected TableIdentifier
    buildIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
    Build an Iceberg TableIdentifier for the given Spark identifier.
    void
    createNamespace(String[] namespace, Map<String,String> metadata)
     
    org.apache.spark.sql.connector.catalog.Table
    createTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String,String> properties)
     
    org.apache.spark.sql.connector.catalog.View
    createView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String,String> properties)
     
    String[]
    defaultNamespace()
     
    boolean
    dropNamespace(String[] namespace, boolean cascade)
     
    boolean
    dropTable(org.apache.spark.sql.connector.catalog.Identifier ident)
     
    boolean
    dropView(org.apache.spark.sql.connector.catalog.Identifier ident)
     
    Catalog
    icebergCatalog()
    Returns the underlying Catalog backing this Spark Catalog
    final void
    initialize(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
     
    void
    invalidateTable(org.apache.spark.sql.connector.catalog.Identifier ident)
     
    boolean
    isExistingNamespace(String[] namespace)
     
    boolean
    isFunctionNamespace(String[] namespace)
     
    default org.apache.spark.sql.connector.catalog.Identifier[]
    listFunctions(String[] namespace)
     
    String[][]
    listNamespaces()
     
    String[][]
    listNamespaces(String[] namespace)
     
    org.apache.spark.sql.connector.catalog.Identifier[]
    listTables(String[] namespace)
     
    org.apache.spark.sql.connector.catalog.Identifier[]
    listViews(String... namespace)
     
    default org.apache.spark.sql.connector.catalog.functions.UnboundFunction
    loadFunction(org.apache.spark.sql.connector.catalog.Identifier ident)
     
    Map<String,String>
    loadNamespaceMetadata(String[] namespace)
     
    Procedure
    loadProcedure(org.apache.spark.sql.connector.catalog.Identifier ident)
    Load a stored procedure by identifier.
    org.apache.spark.sql.connector.catalog.Table
    loadTable(org.apache.spark.sql.connector.catalog.Identifier ident)
     
    org.apache.spark.sql.connector.catalog.Table
    loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, long timestamp)
     
    org.apache.spark.sql.connector.catalog.Table
    loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, String version)
     
    org.apache.spark.sql.connector.catalog.View
    loadView(org.apache.spark.sql.connector.catalog.Identifier ident)
     
    String
    name()
     
    boolean
    purgeTable(org.apache.spark.sql.connector.catalog.Identifier ident)
     
    void
    renameTable(org.apache.spark.sql.connector.catalog.Identifier from, org.apache.spark.sql.connector.catalog.Identifier to)
     
    void
    renameView(org.apache.spark.sql.connector.catalog.Identifier fromIdentifier, org.apache.spark.sql.connector.catalog.Identifier toIdentifier)
     
    org.apache.spark.sql.connector.catalog.View
    replaceView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String,String> properties)
    Replace a view in the catalog
    org.apache.spark.sql.connector.catalog.StagedTable
    stageCreate(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String,String> properties)
     
    org.apache.spark.sql.connector.catalog.StagedTable
    stageCreateOrReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String,String> properties)
     
    org.apache.spark.sql.connector.catalog.StagedTable
    stageReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String,String> properties)
     
    boolean
    useNullableQuerySchema()

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.apache.spark.sql.connector.catalog.FunctionCatalog

    functionExists

    Methods inherited from interface org.apache.spark.sql.connector.catalog.StagingTableCatalog

    stageCreate, stageCreateOrReplace, stageReplace

    Methods inherited from interface org.apache.spark.sql.connector.catalog.SupportsNamespaces

    namespaceExists

    Methods inherited from interface org.apache.spark.sql.connector.catalog.TableCatalog

    capabilities, createTable, tableExists

    Methods inherited from interface org.apache.spark.sql.connector.catalog.ViewCatalog

    invalidateView, viewExists
  • Constructor Details

    • SparkCatalog

      public SparkCatalog()
  • Method Details

    • buildIcebergCatalog

      protected Catalog buildIcebergCatalog(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
      Build an Iceberg Catalog to be used by this Spark catalog adapter.
      Parameters:
      name - Spark's catalog name
      options - Spark's catalog options
      Returns:
      an Iceberg catalog
    • buildIdentifier

      protected TableIdentifier buildIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
      Build an Iceberg TableIdentifier for the given Spark identifier.
      Parameters:
      identifier - Spark's identifier
      Returns:
      an Iceberg identifier
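      Because buildIcebergCatalog and buildIdentifier are protected, a subclass can customize how the backing catalog and identifiers are constructed. A minimal non-executable sketch (the class name CustomSparkCatalog is illustrative, not part of Iceberg; it assumes Spark and Iceberg on the classpath):

      ```java
      import org.apache.iceberg.catalog.Catalog;
      import org.apache.iceberg.catalog.TableIdentifier;
      import org.apache.iceberg.spark.SparkCatalog;
      import org.apache.spark.sql.connector.catalog.Identifier;
      import org.apache.spark.sql.util.CaseInsensitiveStringMap;

      // Hypothetical subclass hooking the two protected extension points.
      public class CustomSparkCatalog extends SparkCatalog {

        @Override
        protected Catalog buildIcebergCatalog(String name, CaseInsensitiveStringMap options) {
          // Inspect or adjust options here before the default catalog is built.
          return super.buildIcebergCatalog(name, options);
        }

        @Override
        protected TableIdentifier buildIdentifier(Identifier identifier) {
          // Map Spark identifiers to Iceberg identifiers; default behavior retained.
          return super.buildIdentifier(identifier);
        }
      }
      ```

      The subclass is then registered as the catalog implementation in place of SparkCatalog via spark.sql.catalog.&lt;name&gt;.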
    • loadTable

      public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      Specified by:
      loadTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchTableException
    • loadTable

      public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, String version) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      Specified by:
      loadTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchTableException
    • loadTable

      public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, long timestamp) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      Specified by:
      loadTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchTableException
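      The version and timestamp overloads of loadTable back Spark's time-travel syntax. A sketch in Spark SQL (catalog, namespace, table, snapshot ID, and timestamp are placeholders):

      ```sql
      -- Resolved via loadTable(ident, version): reads a specific table snapshot.
      SELECT * FROM my_catalog.db.events VERSION AS OF 10963874102873;

      -- Resolved via loadTable(ident, timestamp): reads the table state as of a point in time.
      SELECT * FROM my_catalog.db.events TIMESTAMP AS OF '2023-01-01 00:00:00';
      ```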
    • createTable

      public org.apache.spark.sql.connector.catalog.Table createTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String,String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
      Specified by:
      createTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
    • stageCreate

      public org.apache.spark.sql.connector.catalog.StagedTable stageCreate(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String,String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
      Specified by:
      stageCreate in interface org.apache.spark.sql.connector.catalog.StagingTableCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
    • stageReplace

      public org.apache.spark.sql.connector.catalog.StagedTable stageReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String,String> properties) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      Specified by:
      stageReplace in interface org.apache.spark.sql.connector.catalog.StagingTableCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchTableException
    • stageCreateOrReplace

      public org.apache.spark.sql.connector.catalog.StagedTable stageCreateOrReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String,String> properties)
      Specified by:
      stageCreateOrReplace in interface org.apache.spark.sql.connector.catalog.StagingTableCatalog
    • alterTable

      public org.apache.spark.sql.connector.catalog.Table alterTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.TableChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      Specified by:
      alterTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchTableException
    • dropTable

      public boolean dropTable(org.apache.spark.sql.connector.catalog.Identifier ident)
      Specified by:
      dropTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
    • purgeTable

      public boolean purgeTable(org.apache.spark.sql.connector.catalog.Identifier ident)
      Specified by:
      purgeTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
    • renameTable

      public void renameTable(org.apache.spark.sql.connector.catalog.Identifier from, org.apache.spark.sql.connector.catalog.Identifier to) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException, org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
      Specified by:
      renameTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
    • invalidateTable

      public void invalidateTable(org.apache.spark.sql.connector.catalog.Identifier ident)
      Specified by:
      invalidateTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
    • listTables

      public org.apache.spark.sql.connector.catalog.Identifier[] listTables(String[] namespace)
      Specified by:
      listTables in interface org.apache.spark.sql.connector.catalog.TableCatalog
    • defaultNamespace

      public String[] defaultNamespace()
      Specified by:
      defaultNamespace in interface org.apache.spark.sql.connector.catalog.CatalogPlugin
    • listNamespaces

      public String[][] listNamespaces()
      Specified by:
      listNamespaces in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
    • listNamespaces

      public String[][] listNamespaces(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      Specified by:
      listNamespaces in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
    • loadNamespaceMetadata

      public Map<String,String> loadNamespaceMetadata(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      Specified by:
      loadNamespaceMetadata in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
    • createNamespace

      public void createNamespace(String[] namespace, Map<String,String> metadata) throws org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException
      Specified by:
      createNamespace in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
      Throws:
      org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException
    • alterNamespace

      public void alterNamespace(String[] namespace, org.apache.spark.sql.connector.catalog.NamespaceChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      Specified by:
      alterNamespace in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
    • dropNamespace

      public boolean dropNamespace(String[] namespace, boolean cascade) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      Specified by:
      dropNamespace in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
    • listViews

      public org.apache.spark.sql.connector.catalog.Identifier[] listViews(String... namespace)
      Specified by:
      listViews in interface org.apache.spark.sql.connector.catalog.ViewCatalog
    • loadView

      public org.apache.spark.sql.connector.catalog.View loadView(org.apache.spark.sql.connector.catalog.Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchViewException
      Specified by:
      loadView in interface org.apache.spark.sql.connector.catalog.ViewCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchViewException
    • createView

      public org.apache.spark.sql.connector.catalog.View createView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String,String> properties) throws org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      Specified by:
      createView in interface org.apache.spark.sql.connector.catalog.ViewCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException
      org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
    • replaceView

      public org.apache.spark.sql.connector.catalog.View replaceView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String,String> properties) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException, org.apache.spark.sql.catalyst.analysis.NoSuchViewException
      Description copied from interface: SupportsReplaceView
      Replace a view in the catalog
      Specified by:
      replaceView in interface SupportsReplaceView
      Parameters:
      ident - a view identifier
      sql - the SQL text that defines the view
      currentCatalog - the current catalog
      currentNamespace - the current namespace
      schema - the view query output schema
      queryColumnNames - the query column names
      columnAliases - the column aliases
      columnComments - the column comments
      properties - the view properties
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the identifier namespace does not exist (optional)
      org.apache.spark.sql.catalyst.analysis.NoSuchViewException - If the view doesn't exist or is a table
    • alterView

      public org.apache.spark.sql.connector.catalog.View alterView(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.ViewChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchViewException, IllegalArgumentException
      Specified by:
      alterView in interface org.apache.spark.sql.connector.catalog.ViewCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchViewException
      IllegalArgumentException
    • dropView

      public boolean dropView(org.apache.spark.sql.connector.catalog.Identifier ident)
      Specified by:
      dropView in interface org.apache.spark.sql.connector.catalog.ViewCatalog
    • renameView

      public void renameView(org.apache.spark.sql.connector.catalog.Identifier fromIdentifier, org.apache.spark.sql.connector.catalog.Identifier toIdentifier) throws org.apache.spark.sql.catalyst.analysis.NoSuchViewException, org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException
      Specified by:
      renameView in interface org.apache.spark.sql.connector.catalog.ViewCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchViewException
      org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException
    • initialize

      public final void initialize(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
      Specified by:
      initialize in interface org.apache.spark.sql.connector.catalog.CatalogPlugin
    • name

      public String name()
      Specified by:
      name in interface org.apache.spark.sql.connector.catalog.CatalogPlugin
    • icebergCatalog

      public Catalog icebergCatalog()
      Description copied from interface: HasIcebergCatalog
      Returns the underlying Catalog backing this Spark Catalog
      Specified by:
      icebergCatalog in interface HasIcebergCatalog
    • loadProcedure

      public Procedure loadProcedure(org.apache.spark.sql.connector.catalog.Identifier ident) throws NoSuchProcedureException
      Description copied from interface: ProcedureCatalog
      Load a stored procedure by identifier.
      Specified by:
      loadProcedure in interface ProcedureCatalog
      Parameters:
      ident - a stored procedure identifier
      Returns:
      the stored procedure's metadata
      Throws:
      NoSuchProcedureException - if there is no matching stored procedure
    • isFunctionNamespace

      public boolean isFunctionNamespace(String[] namespace)
    • isExistingNamespace

      public boolean isExistingNamespace(String[] namespace)
    • useNullableQuerySchema

      public boolean useNullableQuerySchema()
      Specified by:
      useNullableQuerySchema in interface org.apache.spark.sql.connector.catalog.TableCatalog
    • listFunctions

      default org.apache.spark.sql.connector.catalog.Identifier[] listFunctions(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      Specified by:
      listFunctions in interface org.apache.spark.sql.connector.catalog.FunctionCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
    • loadFunction

      default org.apache.spark.sql.connector.catalog.functions.UnboundFunction loadFunction(org.apache.spark.sql.connector.catalog.Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchFunctionException
      Specified by:
      loadFunction in interface org.apache.spark.sql.connector.catalog.FunctionCatalog
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchFunctionException