Package org.apache.iceberg.spark
Class SparkCatalog
- java.lang.Object
-
- org.apache.iceberg.spark.SparkCatalog
-
- All Implemented Interfaces:
HasIcebergCatalog, org.apache.spark.sql.connector.catalog.CatalogPlugin, org.apache.spark.sql.connector.catalog.FunctionCatalog, org.apache.spark.sql.connector.catalog.StagingTableCatalog, org.apache.spark.sql.connector.catalog.SupportsNamespaces, org.apache.spark.sql.connector.catalog.TableCatalog, ProcedureCatalog
public class SparkCatalog extends java.lang.Object
A Spark TableCatalog implementation that wraps an Iceberg Catalog.
This supports the following catalog configuration options:
- type - catalog type, "hive" or "hadoop" or "rest". To specify a non-hive or hadoop catalog, use the catalog-impl option.
- uri - the Hive Metastore URI for Hive catalog or REST URI for REST catalog
- warehouse - the warehouse path (Hadoop catalog only)
- catalog-impl - a custom Catalog implementation to use
- io-impl - a custom FileIO implementation to use
- metrics-reporter-impl - a custom MetricsReporter implementation to use
- default-namespace - a namespace to use as the default
- cache-enabled - whether to enable catalog cache
- cache.case-sensitive - whether the catalog cache should compare table identifiers in a case sensitive way
- cache.expiration-interval-ms - interval in millis before expiring tables from catalog cache. Refer to CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS for further details and significant values.
- table-default.$tablePropertyKey - table property $tablePropertyKey default at catalog level
- table-override.$tablePropertyKey - table property $tablePropertyKey enforced at catalog level
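The options above are normally supplied as Spark SQL configuration entries under the spark.sql.catalog.<catalog-name> prefix, with the prefix key itself set to this class name. The following is a minimal sketch of how those keys are assembled, using only the JDK. The class name SparkCatalogConfigSketch, the helper catalogConf, the catalog name my_catalog, and the metastore URI are all hypothetical and chosen for illustration; the sketch does not start Spark or contact a catalog.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SparkCatalogConfigSketch {

    // Hypothetical helper: turns a catalog name and its options into
    // Spark configuration entries of the form
    //   spark.sql.catalog.<name>          = org.apache.iceberg.spark.SparkCatalog
    //   spark.sql.catalog.<name>.<option> = <value>
    static Map<String, String> catalogConf(String catalogName, Map<String, String> options) {
        String prefix = "spark.sql.catalog." + catalogName;
        Map<String, String> conf = new LinkedHashMap<>();
        // The bare prefix key selects the CatalogPlugin implementation.
        conf.put(prefix, "org.apache.iceberg.spark.SparkCatalog");
        // Every other option is passed through under the same prefix.
        for (Map.Entry<String, String> option : options.entrySet()) {
            conf.put(prefix + "." + option.getKey(), option.getValue());
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("type", "hive");
        // Placeholder URI; replace with the address of a real metastore.
        options.put("uri", "thrift://metastore-host:9083");
        options.put("cache-enabled", "true");
        // Catalog-level default for the table property write.format.default.
        options.put("table-default.write.format.default", "parquet");

        catalogConf("my_catalog", options)
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Each resulting entry can be passed to Spark as a --conf key=value argument or through SparkSession.builder().config(key, value) before the session is created.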
-
-
Constructor Summary
Constructor Description
SparkCatalog()
-
Method Summary
Modifier and Type Method Description
void
alterNamespace(java.lang.String[] namespace, org.apache.spark.sql.connector.catalog.NamespaceChange... changes)
org.apache.spark.sql.connector.catalog.Table
alterTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.TableChange... changes)
protected Catalog
buildIcebergCatalog(java.lang.String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
Build an Iceberg Catalog to be used by this Spark catalog adapter.
protected TableIdentifier
buildIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
Build an Iceberg TableIdentifier for the given Spark identifier.
void
createNamespace(java.lang.String[] namespace, java.util.Map<java.lang.String,java.lang.String> metadata)
org.apache.spark.sql.connector.catalog.Table
createTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,java.lang.String> properties)
java.lang.String[]
defaultNamespace()
boolean
dropNamespace(java.lang.String[] namespace, boolean cascade)
boolean
dropTable(org.apache.spark.sql.connector.catalog.Identifier ident)
Catalog
icebergCatalog()
Returns the underlying Catalog backing this Spark Catalog
void
initialize(java.lang.String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
void
invalidateTable(org.apache.spark.sql.connector.catalog.Identifier ident)
boolean
isExistingNamespace(java.lang.String[] namespace)
boolean
isFunctionNamespace(java.lang.String[] namespace)
default org.apache.spark.sql.connector.catalog.Identifier[]
listFunctions(java.lang.String[] namespace)
java.lang.String[][]
listNamespaces()
java.lang.String[][]
listNamespaces(java.lang.String[] namespace)
org.apache.spark.sql.connector.catalog.Identifier[]
listTables(java.lang.String[] namespace)
default org.apache.spark.sql.connector.catalog.functions.UnboundFunction
loadFunction(org.apache.spark.sql.connector.catalog.Identifier ident)
java.util.Map<java.lang.String,java.lang.String>
loadNamespaceMetadata(java.lang.String[] namespace)
Procedure
loadProcedure(org.apache.spark.sql.connector.catalog.Identifier ident)
Load a stored procedure by identifier.
org.apache.spark.sql.connector.catalog.Table
loadTable(org.apache.spark.sql.connector.catalog.Identifier ident)
org.apache.spark.sql.connector.catalog.Table
loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, long timestamp)
org.apache.spark.sql.connector.catalog.Table
loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, java.lang.String version)
java.lang.String
name()
boolean
purgeTable(org.apache.spark.sql.connector.catalog.Identifier ident)
void
renameTable(org.apache.spark.sql.connector.catalog.Identifier from, org.apache.spark.sql.connector.catalog.Identifier to)
org.apache.spark.sql.connector.catalog.StagedTable
stageCreate(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,java.lang.String> properties)
org.apache.spark.sql.connector.catalog.StagedTable
stageCreateOrReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,java.lang.String> properties)
org.apache.spark.sql.connector.catalog.StagedTable
stageReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,java.lang.String> properties)
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.spark.sql.connector.catalog.FunctionCatalog
functionExists
-
Methods inherited from interface org.apache.spark.sql.connector.catalog.StagingTableCatalog
stageCreate, stageCreateOrReplace, stageReplace
-
-
-
-
Method Detail
-
buildIcebergCatalog
protected Catalog buildIcebergCatalog(java.lang.String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
Build an Iceberg Catalog to be used by this Spark catalog adapter.
- Parameters:
name - Spark's catalog name
options - Spark's catalog options
- Returns:
an Iceberg catalog
-
buildIdentifier
protected TableIdentifier buildIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
Build an Iceberg TableIdentifier for the given Spark identifier.
- Parameters:
identifier - Spark's identifier
- Returns:
an Iceberg identifier
-
loadTable
public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
loadTable
public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, java.lang.String version) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
loadTable
public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, long timestamp) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
createTable
public org.apache.spark.sql.connector.catalog.Table createTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,java.lang.String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
- Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
-
stageCreate
public org.apache.spark.sql.connector.catalog.StagedTable stageCreate(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,java.lang.String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
- Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
-
stageReplace
public org.apache.spark.sql.connector.catalog.StagedTable stageReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,java.lang.String> properties) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
stageCreateOrReplace
public org.apache.spark.sql.connector.catalog.StagedTable stageCreateOrReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, java.util.Map<java.lang.String,java.lang.String> properties)
-
alterTable
public org.apache.spark.sql.connector.catalog.Table alterTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.TableChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
dropTable
public boolean dropTable(org.apache.spark.sql.connector.catalog.Identifier ident)
-
purgeTable
public boolean purgeTable(org.apache.spark.sql.connector.catalog.Identifier ident)
-
renameTable
public void renameTable(org.apache.spark.sql.connector.catalog.Identifier from, org.apache.spark.sql.connector.catalog.Identifier to) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException, org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
-
invalidateTable
public void invalidateTable(org.apache.spark.sql.connector.catalog.Identifier ident)
-
listTables
public org.apache.spark.sql.connector.catalog.Identifier[] listTables(java.lang.String[] namespace)
-
defaultNamespace
public java.lang.String[] defaultNamespace()
-
listNamespaces
public java.lang.String[][] listNamespaces()
-
listNamespaces
public java.lang.String[][] listNamespaces(java.lang.String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
loadNamespaceMetadata
public java.util.Map<java.lang.String,java.lang.String> loadNamespaceMetadata(java.lang.String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
createNamespace
public void createNamespace(java.lang.String[] namespace, java.util.Map<java.lang.String,java.lang.String> metadata) throws org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException
- Throws:
org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException
-
alterNamespace
public void alterNamespace(java.lang.String[] namespace, org.apache.spark.sql.connector.catalog.NamespaceChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
dropNamespace
public boolean dropNamespace(java.lang.String[] namespace, boolean cascade) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
initialize
public final void initialize(java.lang.String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
-
name
public java.lang.String name()
-
icebergCatalog
public Catalog icebergCatalog()
Description copied from interface: HasIcebergCatalog
Returns the underlying Catalog backing this Spark Catalog
-
loadProcedure
public Procedure loadProcedure(org.apache.spark.sql.connector.catalog.Identifier ident) throws NoSuchProcedureException
Description copied from interface: ProcedureCatalog
Load a stored procedure by identifier.
- Specified by:
loadProcedure in interface ProcedureCatalog
- Parameters:
ident - a stored procedure identifier
- Returns:
the stored procedure's metadata
- Throws:
NoSuchProcedureException - if there is no matching stored procedure
-
isFunctionNamespace
public boolean isFunctionNamespace(java.lang.String[] namespace)
-
isExistingNamespace
public boolean isExistingNamespace(java.lang.String[] namespace)
-
listFunctions
public default org.apache.spark.sql.connector.catalog.Identifier[] listFunctions(java.lang.String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
- Specified by:
listFunctions in interface org.apache.spark.sql.connector.catalog.FunctionCatalog
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
loadFunction
public default org.apache.spark.sql.connector.catalog.functions.UnboundFunction loadFunction(org.apache.spark.sql.connector.catalog.Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchFunctionException
- Specified by:
loadFunction in interface org.apache.spark.sql.connector.catalog.FunctionCatalog
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchFunctionException
-
-