Package org.apache.iceberg.spark
Class SparkCatalog
java.lang.Object
org.apache.iceberg.spark.SparkCatalog
All Implemented Interfaces:
HasIcebergCatalog, SupportsReplaceView, org.apache.spark.sql.connector.catalog.CatalogPlugin, org.apache.spark.sql.connector.catalog.FunctionCatalog, org.apache.spark.sql.connector.catalog.StagingTableCatalog, org.apache.spark.sql.connector.catalog.SupportsNamespaces, org.apache.spark.sql.connector.catalog.TableCatalog, org.apache.spark.sql.connector.catalog.ViewCatalog, ProcedureCatalog
public class SparkCatalog
extends Object
implements org.apache.spark.sql.connector.catalog.ViewCatalog, SupportsReplaceView
A Spark TableCatalog implementation that wraps an Iceberg Catalog.
This supports the following catalog configuration options:
type - catalog type, "hive", "hadoop", or "rest". To specify a non-Hive or non-Hadoop catalog, use the catalog-impl option.
uri - the Hive Metastore URI for the Hive catalog or the REST URI for the REST catalog
warehouse - the warehouse path (Hadoop catalog only)
catalog-impl - a custom Catalog implementation to use
io-impl - a custom FileIO implementation to use
metrics-reporter-impl - a custom MetricsReporter implementation to use
default-namespace - a namespace to use as the default
cache-enabled - whether to enable catalog cache
cache.case-sensitive - whether the catalog cache should compare table identifiers in a case-sensitive way
cache.expiration-interval-ms - interval in millis before expiring tables from catalog cache. Refer to CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS for further details and significant values.
table-default.$tablePropertyKey - table property $tablePropertyKey default at catalog level
table-override.$tablePropertyKey - table property $tablePropertyKey enforced at catalog level
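As an illustration of how these options are passed to the catalog, a hypothetical catalog named demo could be wired up in spark-defaults.conf; Spark hands everything under the spark.sql.catalog.demo. prefix to SparkCatalog.initialize as its options map (the catalog name, REST URI, and property values below are placeholders, not defaults):

```properties
# Register an Iceberg SparkCatalog under the name "demo"
spark.sql.catalog.demo                                      org.apache.iceberg.spark.SparkCatalog
# "type" selects a built-in catalog: hive, hadoop, or rest
spark.sql.catalog.demo.type                                 rest
spark.sql.catalog.demo.uri                                  http://localhost:8181
# Catalog-level caching behavior
spark.sql.catalog.demo.cache-enabled                        true
spark.sql.catalog.demo.cache.expiration-interval-ms         30000
# Default applied to new tables unless the table sets its own value
spark.sql.catalog.demo.table-default.write.format.default   parquet
```

With this in place, tables resolve as demo.<namespace>.<table> in Spark SQL.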
-
Field Summary
Fields inherited from interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
PROP_COMMENT, PROP_LOCATION, PROP_OWNER
Fields inherited from interface org.apache.spark.sql.connector.catalog.TableCatalog
OPTION_PREFIX, PROP_COMMENT, PROP_EXTERNAL, PROP_IS_MANAGED_LOCATION, PROP_LOCATION, PROP_OWNER, PROP_PROVIDER
Fields inherited from interface org.apache.spark.sql.connector.catalog.ViewCatalog
PROP_COMMENT, PROP_CREATE_ENGINE_VERSION, PROP_ENGINE_VERSION, PROP_OWNER, RESERVED_PROPERTIES
-
Constructor Summary
-
Method Summary
void alterNamespace(String[] namespace, org.apache.spark.sql.connector.catalog.NamespaceChange... changes)
org.apache.spark.sql.connector.catalog.Table alterTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.TableChange... changes)
org.apache.spark.sql.connector.catalog.View alterView(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.ViewChange... changes)
protected Catalog buildIcebergCatalog(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options) - Build an Iceberg Catalog to be used by this Spark catalog adapter.
protected TableIdentifier buildIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier) - Build an Iceberg TableIdentifier for the given Spark identifier.
void createNamespace(String[] namespace, Map<String, String> metadata)
org.apache.spark.sql.connector.catalog.Table createTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties)
org.apache.spark.sql.connector.catalog.View createView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String, String> properties)
String[] defaultNamespace()
boolean dropNamespace(String[] namespace, boolean cascade)
boolean dropTable(org.apache.spark.sql.connector.catalog.Identifier ident)
boolean dropView(org.apache.spark.sql.connector.catalog.Identifier ident)
Catalog icebergCatalog() - Returns the underlying Catalog backing this Spark Catalog.
final void initialize(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
void invalidateTable(org.apache.spark.sql.connector.catalog.Identifier ident)
boolean isExistingNamespace(String[] namespace)
boolean isFunctionNamespace(String[] namespace)
default org.apache.spark.sql.connector.catalog.Identifier[] listFunctions(String[] namespace)
String[][] listNamespaces()
String[][] listNamespaces(String[] namespace)
org.apache.spark.sql.connector.catalog.Identifier[] listTables(String[] namespace)
org.apache.spark.sql.connector.catalog.Identifier[] listViews(String... namespace)
default org.apache.spark.sql.connector.catalog.functions.UnboundFunction loadFunction(org.apache.spark.sql.connector.catalog.Identifier ident)
Map<String, String> loadNamespaceMetadata(String[] namespace)
Procedure loadProcedure(org.apache.spark.sql.connector.catalog.Identifier ident) - Load a stored procedure by identifier.
org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident)
org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, long timestamp)
org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, String version)
org.apache.spark.sql.connector.catalog.View loadView(org.apache.spark.sql.connector.catalog.Identifier ident)
String name()
boolean purgeTable(org.apache.spark.sql.connector.catalog.Identifier ident)
void renameTable(org.apache.spark.sql.connector.catalog.Identifier from, org.apache.spark.sql.connector.catalog.Identifier to)
void renameView(org.apache.spark.sql.connector.catalog.Identifier fromIdentifier, org.apache.spark.sql.connector.catalog.Identifier toIdentifier)
org.apache.spark.sql.connector.catalog.View replaceView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String, String> properties) - Replace a view in the catalog.
org.apache.spark.sql.connector.catalog.StagedTable stageCreate(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties)
org.apache.spark.sql.connector.catalog.StagedTable stageCreateOrReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties)
org.apache.spark.sql.connector.catalog.StagedTable stageReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties)
boolean useNullableQuerySchema()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.sql.connector.catalog.FunctionCatalog
functionExists
Methods inherited from interface org.apache.spark.sql.connector.catalog.StagingTableCatalog
stageCreate, stageCreateOrReplace, stageReplace
Methods inherited from interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
namespaceExists
Methods inherited from interface org.apache.spark.sql.connector.catalog.TableCatalog
capabilities, createTable, tableExists
Methods inherited from interface org.apache.spark.sql.connector.catalog.ViewCatalog
invalidateView, viewExists
-
Constructor Details
-
SparkCatalog
public SparkCatalog()
-
-
Method Details
-
buildIcebergCatalog
protected Catalog buildIcebergCatalog(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
Build an Iceberg Catalog to be used by this Spark catalog adapter.
Parameters:
name - Spark's catalog name
options - Spark's catalog options
Returns:
an Iceberg catalog
-
buildIdentifier
protected TableIdentifier buildIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
Build an Iceberg TableIdentifier for the given Spark identifier.
Parameters:
identifier - Spark's identifier
Returns:
an Iceberg identifier
-
loadTable
public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Specified by:
loadTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
loadTable
public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, String version) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Specified by:
loadTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
loadTable
public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, long timestamp) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Specified by:
loadTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
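The three loadTable overloads back Spark SQL's time-travel syntax: the String version overload serves VERSION AS OF (for Iceberg, a snapshot ID, branch, or tag), and the long timestamp overload serves TIMESTAMP AS OF. A minimal sketch of the SQL a client would issue for each path; the table name demo.db.events and the snapshot ID are hypothetical:

```java
public class TimeTravel {
    // VERSION AS OF resolves through loadTable(ident, String version);
    // for Iceberg the version is a snapshot ID, branch name, or tag name.
    static String versionAsOf(String table, String version) {
        return "SELECT * FROM " + table + " VERSION AS OF " + version;
    }

    // TIMESTAMP AS OF resolves through loadTable(ident, long timestamp),
    // with Spark converting the literal to a timestamp in millis.
    static String timestampAsOf(String table, String timestamp) {
        return "SELECT * FROM " + table + " TIMESTAMP AS OF '" + timestamp + "'";
    }

    public static void main(String[] args) {
        System.out.println(versionAsOf("demo.db.events", "10963874102873"));
        System.out.println(timestampAsOf("demo.db.events", "2024-01-01 00:00:00"));
    }
}
```

The plain loadTable(ident) overload, by contrast, always resolves the table's current state.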
createTable
public org.apache.spark.sql.connector.catalog.Table createTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
Specified by:
createTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
-
stageCreate
public org.apache.spark.sql.connector.catalog.StagedTable stageCreate(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
Specified by:
stageCreate in interface org.apache.spark.sql.connector.catalog.StagingTableCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
-
stageReplace
public org.apache.spark.sql.connector.catalog.StagedTable stageReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Specified by:
stageReplace in interface org.apache.spark.sql.connector.catalog.StagingTableCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
stageCreateOrReplace
public org.apache.spark.sql.connector.catalog.StagedTable stageCreateOrReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties)
Specified by:
stageCreateOrReplace in interface org.apache.spark.sql.connector.catalog.StagingTableCatalog
-
alterTable
public org.apache.spark.sql.connector.catalog.Table alterTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.TableChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Specified by:
alterTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
dropTable
public boolean dropTable(org.apache.spark.sql.connector.catalog.Identifier ident)
Specified by:
dropTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
-
purgeTable
public boolean purgeTable(org.apache.spark.sql.connector.catalog.Identifier ident)
Specified by:
purgeTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
-
renameTable
public void renameTable(org.apache.spark.sql.connector.catalog.Identifier from, org.apache.spark.sql.connector.catalog.Identifier to) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException, org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
Specified by:
renameTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
-
invalidateTable
public void invalidateTable(org.apache.spark.sql.connector.catalog.Identifier ident)
Specified by:
invalidateTable in interface org.apache.spark.sql.connector.catalog.TableCatalog
-
listTables
public org.apache.spark.sql.connector.catalog.Identifier[] listTables(String[] namespace)
Specified by:
listTables in interface org.apache.spark.sql.connector.catalog.TableCatalog
-
defaultNamespace
public String[] defaultNamespace()
Specified by:
defaultNamespace in interface org.apache.spark.sql.connector.catalog.CatalogPlugin
-
listNamespaces
public String[][] listNamespaces()
Specified by:
listNamespaces in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
-
listNamespaces
public String[][] listNamespaces(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Specified by:
listNamespaces in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
loadNamespaceMetadata
public Map<String, String> loadNamespaceMetadata(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Specified by:
loadNamespaceMetadata in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
createNamespace
public void createNamespace(String[] namespace, Map<String, String> metadata) throws org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException
Specified by:
createNamespace in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
Throws:
org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException
-
alterNamespace
public void alterNamespace(String[] namespace, org.apache.spark.sql.connector.catalog.NamespaceChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Specified by:
alterNamespace in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
dropNamespace
public boolean dropNamespace(String[] namespace, boolean cascade) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Specified by:
dropNamespace in interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
listViews
public org.apache.spark.sql.connector.catalog.Identifier[] listViews(String... namespace)
Specified by:
listViews in interface org.apache.spark.sql.connector.catalog.ViewCatalog
-
loadView
public org.apache.spark.sql.connector.catalog.View loadView(org.apache.spark.sql.connector.catalog.Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchViewException
Specified by:
loadView in interface org.apache.spark.sql.connector.catalog.ViewCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchViewException
-
createView
public org.apache.spark.sql.connector.catalog.View createView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Specified by:
createView in interface org.apache.spark.sql.connector.catalog.ViewCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
replaceView
public org.apache.spark.sql.connector.catalog.View replaceView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException, org.apache.spark.sql.catalyst.analysis.NoSuchViewException
Description copied from interface: SupportsReplaceView
Replace a view in the catalog.
Specified by:
replaceView in interface SupportsReplaceView
Parameters:
ident - a view identifier
sql - the SQL text that defines the view
currentCatalog - the current catalog
currentNamespace - the current namespace
schema - the view query output schema
queryColumnNames - the query column names
columnAliases - the column aliases
columnComments - the column comments
properties - the view properties
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the identifier namespace does not exist (optional)
org.apache.spark.sql.catalyst.analysis.NoSuchViewException - If the view doesn't exist or is a table
-
alterView
public org.apache.spark.sql.connector.catalog.View alterView(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.ViewChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchViewException, IllegalArgumentException
Specified by:
alterView in interface org.apache.spark.sql.connector.catalog.ViewCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchViewException
IllegalArgumentException
-
dropView
public boolean dropView(org.apache.spark.sql.connector.catalog.Identifier ident)
Specified by:
dropView in interface org.apache.spark.sql.connector.catalog.ViewCatalog
-
renameView
public void renameView(org.apache.spark.sql.connector.catalog.Identifier fromIdentifier, org.apache.spark.sql.connector.catalog.Identifier toIdentifier) throws org.apache.spark.sql.catalyst.analysis.NoSuchViewException, org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException
Specified by:
renameView in interface org.apache.spark.sql.connector.catalog.ViewCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchViewException
org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException
-
initialize
public final void initialize(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
Specified by:
initialize in interface org.apache.spark.sql.connector.catalog.CatalogPlugin
-
name
public String name()
Specified by:
name in interface org.apache.spark.sql.connector.catalog.CatalogPlugin
-
icebergCatalog
public Catalog icebergCatalog()
Description copied from interface: HasIcebergCatalog
Returns the underlying Catalog backing this Spark Catalog.
Specified by:
icebergCatalog in interface HasIcebergCatalog
-
loadProcedure
public Procedure loadProcedure(org.apache.spark.sql.connector.catalog.Identifier ident) throws NoSuchProcedureException
Description copied from interface: ProcedureCatalog
Load a stored procedure by identifier.
Specified by:
loadProcedure in interface ProcedureCatalog
Parameters:
ident - a stored procedure identifier
Returns:
the stored procedure's metadata
Throws:
NoSuchProcedureException - if there is no matching stored procedure
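In Spark SQL, this resolution happens behind the CALL statement: the procedure identifier is the catalog's system namespace plus the procedure name. A minimal sketch of the statement shape; rollback_to_snapshot is a real Iceberg procedure, while the catalog name demo and the argument values are placeholders:

```java
public class ProcedureCall {
    // Builds a CALL statement of the form the ProcedureCatalog resolves:
    // <catalog>.system.<procedure>(<args>), where loadProcedure receives
    // the identifier ("system", <procedure>).
    static String call(String catalog, String procedure, String args) {
        return "CALL " + catalog + ".system." + procedure + "(" + args + ")";
    }

    public static void main(String[] args) {
        // Placeholder catalog name, table, and snapshot ID.
        System.out.println(call("demo", "rollback_to_snapshot", "'db.tbl', 10963874102873"));
    }
}
```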
-
isFunctionNamespace
public boolean isFunctionNamespace(String[] namespace)
-
isExistingNamespace
public boolean isExistingNamespace(String[] namespace)
-
useNullableQuerySchema
public boolean useNullableQuerySchema()
Specified by:
useNullableQuerySchema in interface org.apache.spark.sql.connector.catalog.TableCatalog
-
listFunctions
default org.apache.spark.sql.connector.catalog.Identifier[] listFunctions(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Specified by:
listFunctions in interface org.apache.spark.sql.connector.catalog.FunctionCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
loadFunction
default org.apache.spark.sql.connector.catalog.functions.UnboundFunction loadFunction(org.apache.spark.sql.connector.catalog.Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchFunctionException
Specified by:
loadFunction in interface org.apache.spark.sql.connector.catalog.FunctionCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchFunctionException
-