Class SparkCatalog

  • All Implemented Interfaces:
    HasIcebergCatalog, org.apache.spark.sql.connector.catalog.CatalogPlugin, org.apache.spark.sql.connector.catalog.StagingTableCatalog, org.apache.spark.sql.connector.catalog.SupportsNamespaces, org.apache.spark.sql.connector.catalog.TableCatalog, ProcedureCatalog

    public class SparkCatalog
    extends java.lang.Object
    A Spark TableCatalog implementation that wraps an Iceberg Catalog.

    This supports the following catalog configuration options:

    • type - catalog type, "hive" or "hadoop"
    • uri - the Hive Metastore URI (Hive catalog only)
    • warehouse - the warehouse path (Hadoop catalog only)
    • default-namespace - a namespace to use as the default
    • cache-enabled - whether to enable catalog cache
    • cache.expiration-interval-ms - interval in millis before expiring tables from catalog cache. Refer to CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS for further details and significant values.

    To use a custom catalog that is not a Hive or Hadoop catalog, extend this class and override buildIcebergCatalog(String, CaseInsensitiveStringMap).

    • Field Summary

      • Fields inherited from interface org.apache.spark.sql.connector.catalog.SupportsNamespaces

        PROP_COMMENT, PROP_LOCATION, PROP_OWNER
      • Fields inherited from interface org.apache.spark.sql.connector.catalog.TableCatalog

        OPTION_PREFIX, PROP_COMMENT, PROP_EXTERNAL, PROP_IS_MANAGED_LOCATION, PROP_LOCATION, PROP_OWNER, PROP_PROVIDER
    • Constructor Summary

      Constructors 
      Constructor Description
      SparkCatalog()  
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void alterNamespace​(java.lang.String[] namespace, org.apache.spark.sql.connector.catalog.NamespaceChange... changes)  
      SparkTable alterTable​(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.TableChange... changes)  
      protected Catalog buildIcebergCatalog​(java.lang.String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
      Build an Iceberg Catalog to be used by this Spark catalog adapter.
      protected TableIdentifier buildIdentifier​(org.apache.spark.sql.connector.catalog.Identifier identifier)
      Build an Iceberg TableIdentifier for the given Spark identifier.
      void createNamespace​(java.lang.String[] namespace, java.util.Map<java.lang.String,​java.lang.String> metadata)  
      SparkTable createTable​(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,​java.lang.String> properties)  
      java.lang.String[] defaultNamespace()  
      boolean dropNamespace​(java.lang.String[] namespace, boolean cascade)  
      boolean dropTable​(org.apache.spark.sql.connector.catalog.Identifier ident)  
      Catalog icebergCatalog()
      Returns the underlying Catalog backing this Spark Catalog
      void initialize​(java.lang.String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)  
      void invalidateTable​(org.apache.spark.sql.connector.catalog.Identifier ident)  
      java.lang.String[][] listNamespaces()  
      java.lang.String[][] listNamespaces​(java.lang.String[] namespace)  
      org.apache.spark.sql.connector.catalog.Identifier[] listTables​(java.lang.String[] namespace)  
      java.util.Map<java.lang.String,​java.lang.String> loadNamespaceMetadata​(java.lang.String[] namespace)  
      Procedure loadProcedure​(org.apache.spark.sql.connector.catalog.Identifier ident)
      Load a stored procedure by identifier.
      SparkTable loadTable​(org.apache.spark.sql.connector.catalog.Identifier ident)  
      SparkTable loadTable​(org.apache.spark.sql.connector.catalog.Identifier ident, long timestamp)  
      SparkTable loadTable​(org.apache.spark.sql.connector.catalog.Identifier ident, java.lang.String version)  
      java.lang.String name()  
      boolean purgeTable​(org.apache.spark.sql.connector.catalog.Identifier ident)  
      void renameTable​(org.apache.spark.sql.connector.catalog.Identifier from, org.apache.spark.sql.connector.catalog.Identifier to)  
      org.apache.spark.sql.connector.catalog.StagedTable stageCreate​(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,​java.lang.String> properties)  
      org.apache.spark.sql.connector.catalog.StagedTable stageCreateOrReplace​(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,​java.lang.String> properties)  
      org.apache.spark.sql.connector.catalog.StagedTable stageReplace​(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,​java.lang.String> properties)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
      • Methods inherited from interface org.apache.spark.sql.connector.catalog.SupportsNamespaces

        namespaceExists
      • Methods inherited from interface org.apache.spark.sql.connector.catalog.TableCatalog

        tableExists
    • Constructor Detail

      • SparkCatalog

        public SparkCatalog()
    • Method Detail

      • buildIcebergCatalog

        protected Catalog buildIcebergCatalog​(java.lang.String name,
                                              org.apache.spark.sql.util.CaseInsensitiveStringMap options)
        Build an Iceberg Catalog to be used by this Spark catalog adapter.
        Parameters:
        name - Spark's catalog name
        options - Spark's catalog options
        Returns:
        an Iceberg catalog
      • buildIdentifier

        protected TableIdentifier buildIdentifier​(org.apache.spark.sql.connector.catalog.Identifier identifier)
        Build an Iceberg TableIdentifier for the given Spark identifier.
        Parameters:
        identifier - Spark's identifier
        Returns:
        an Iceberg identifier
      • loadTable

        public SparkTable loadTable​(org.apache.spark.sql.connector.catalog.Identifier ident)
                             throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
        Throws:
        org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      • loadTable

        public SparkTable loadTable​(org.apache.spark.sql.connector.catalog.Identifier ident,
                                    java.lang.String version)
                             throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
        Throws:
        org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      • loadTable

        public SparkTable loadTable​(org.apache.spark.sql.connector.catalog.Identifier ident,
                                    long timestamp)
                             throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
        Throws:
        org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      • createTable

        public SparkTable createTable​(org.apache.spark.sql.connector.catalog.Identifier ident,
                                      org.apache.spark.sql.types.StructType schema,
                                      org.apache.spark.sql.connector.expressions.Transform[] transforms,
                                      java.util.Map<java.lang.String,​java.lang.String> properties)
                               throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
        Throws:
        org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
      • stageCreate

        public org.apache.spark.sql.connector.catalog.StagedTable stageCreate​(org.apache.spark.sql.connector.catalog.Identifier ident,
                                                                              org.apache.spark.sql.types.StructType schema,
                                                                              org.apache.spark.sql.connector.expressions.Transform[] transforms,
                                                                              java.util.Map<java.lang.String,​java.lang.String> properties)
                                                                       throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
        Throws:
        org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
      • stageReplace

        public org.apache.spark.sql.connector.catalog.StagedTable stageReplace​(org.apache.spark.sql.connector.catalog.Identifier ident,
                                                                               org.apache.spark.sql.types.StructType schema,
                                                                               org.apache.spark.sql.connector.expressions.Transform[] transforms,
                                                                               java.util.Map<java.lang.String,​java.lang.String> properties)
                                                                        throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
        Throws:
        org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      • stageCreateOrReplace

        public org.apache.spark.sql.connector.catalog.StagedTable stageCreateOrReplace​(org.apache.spark.sql.connector.catalog.Identifier ident,
                                                                                       org.apache.spark.sql.types.StructType schema,
                                                                                       org.apache.spark.sql.connector.expressions.Transform[] transforms,
                                                                                       java.util.Map<java.lang.String,​java.lang.String> properties)
      • alterTable

        public SparkTable alterTable​(org.apache.spark.sql.connector.catalog.Identifier ident,
                                     org.apache.spark.sql.connector.catalog.TableChange... changes)
                              throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
        Throws:
        org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      • dropTable

        public boolean dropTable​(org.apache.spark.sql.connector.catalog.Identifier ident)
      • purgeTable

        public boolean purgeTable​(org.apache.spark.sql.connector.catalog.Identifier ident)
      • renameTable

        public void renameTable​(org.apache.spark.sql.connector.catalog.Identifier from,
                                org.apache.spark.sql.connector.catalog.Identifier to)
                         throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException,
                                org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
        Throws:
        org.apache.spark.sql.catalyst.analysis.NoSuchTableException
        org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
      • invalidateTable

        public void invalidateTable​(org.apache.spark.sql.connector.catalog.Identifier ident)
      • listTables

        public org.apache.spark.sql.connector.catalog.Identifier[] listTables​(java.lang.String[] namespace)
      • defaultNamespace

        public java.lang.String[] defaultNamespace()
      • listNamespaces

        public java.lang.String[][] listNamespaces()
      • listNamespaces

        public java.lang.String[][] listNamespaces​(java.lang.String[] namespace)
                                            throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
        Throws:
        org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      • loadNamespaceMetadata

        public java.util.Map<java.lang.String,​java.lang.String> loadNamespaceMetadata​(java.lang.String[] namespace)
                                                                                     throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
        Throws:
        org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      • createNamespace

        public void createNamespace​(java.lang.String[] namespace,
                                    java.util.Map<java.lang.String,​java.lang.String> metadata)
                             throws org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException
        Throws:
        org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException
      • alterNamespace

        public void alterNamespace​(java.lang.String[] namespace,
                                   org.apache.spark.sql.connector.catalog.NamespaceChange... changes)
                            throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
        Throws:
        org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      • dropNamespace

        public boolean dropNamespace​(java.lang.String[] namespace,
                                     boolean cascade)
                              throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
        Throws:
        org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      • initialize

        public final void initialize​(java.lang.String name,
                                     org.apache.spark.sql.util.CaseInsensitiveStringMap options)
      • name

        public java.lang.String name()
      • icebergCatalog

        public Catalog icebergCatalog()
        Description copied from interface: HasIcebergCatalog
        Returns the underlying Catalog backing this Spark Catalog