Package org.apache.iceberg.spark.actions
Class BaseSnapshotTableSparkAction
- java.lang.Object
  - org.apache.iceberg.spark.actions.BaseSnapshotTableSparkAction

All Implemented Interfaces:
Action<SnapshotTable,SnapshotTable.Result>, SnapshotTable
public class BaseSnapshotTableSparkAction extends java.lang.Object implements SnapshotTable
Creates a new Iceberg table based on a source Spark table. The new Iceberg table will have a different data and metadata directory, allowing it to exist independently of the source table.
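For example, the action is commonly obtained through the SparkActions entry point rather than constructed directly. A minimal usage sketch (the Spark session, table identifiers, and property value below are illustrative, not part of this class's contract):

import org.apache.iceberg.spark.actions.SparkActions;
import org.apache.spark.sql.SparkSession;

public class SnapshotTableExample {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("snapshot-table-example")
        .getOrCreate();

    // Snapshot an existing Spark table into a new, independent Iceberg table.
    // "db.source_table" and "db.iceberg_snapshot" are illustrative identifiers.
    SparkActions.get(spark)
        .snapshotTable("db.source_table")        // source Spark table
        .as("db.iceberg_snapshot")               // destination Iceberg table identifier
        .tableProperty("format-version", "2")    // optional: set a table property
        .execute();

    spark.stop();
  }
}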
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.iceberg.actions.SnapshotTable
SnapshotTable.Result
-
Field Summary

protected static java.util.List<java.lang.String> EXCLUDED_PROPERTIES
protected static java.lang.String ICEBERG_METADATA_FOLDER
protected static java.lang.String LOCATION
-
Constructor Summary

BaseSnapshotTableSparkAction(org.apache.spark.sql.SparkSession spark, org.apache.spark.sql.connector.catalog.CatalogPlugin sourceCatalog, org.apache.spark.sql.connector.catalog.Identifier sourceTableIdent, org.apache.spark.sql.connector.catalog.CatalogPlugin destCatalog, org.apache.spark.sql.connector.catalog.Identifier destTableIdent)
-
Method Summary

protected java.util.Map<java.lang.String,java.lang.String> additionalProperties()

SnapshotTable as(java.lang.String ident)
  Sets the table identifier for the newly created Iceberg table.

protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestFileDF(Table table)

protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestListDF(Table table)

protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildOtherMetadataFileDF(Table table)

protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidDataFileDF(Table table)

protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidMetadataFileDF(Table table)

protected org.apache.spark.sql.connector.catalog.StagingTableCatalog checkDestinationCatalog(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)

protected org.apache.spark.sql.connector.catalog.TableCatalog checkSourceCatalog(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)

protected org.apache.spark.sql.connector.catalog.StagingTableCatalog destCatalog()

protected org.apache.spark.sql.connector.catalog.Identifier destTableIdent()

protected java.util.Map<java.lang.String,java.lang.String> destTableProps()

protected void ensureNameMappingPresent(Table table)

SnapshotTable.Result execute()
  Executes this action.

protected java.lang.String getMetadataLocation(Table table)

protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table, MetadataTableType type)

protected JobGroupInfo newJobGroupInfo(java.lang.String groupId, java.lang.String desc)

protected Table newStaticTable(TableMetadata metadata, FileIO io)

ThisT option(java.lang.String name, java.lang.String value)
  Configures this action with an extra option.

protected java.util.Map<java.lang.String,java.lang.String> options()

ThisT options(java.util.Map<java.lang.String,java.lang.String> newOptions)
  Configures this action with extra options.

protected SnapshotTable self()

protected void setProperties(java.util.Map<java.lang.String,java.lang.String> properties)

protected void setProperty(java.lang.String key, java.lang.String value)

protected org.apache.spark.sql.connector.catalog.TableCatalog sourceCatalog()

protected org.apache.spark.sql.connector.catalog.Identifier sourceTableIdent()

protected java.lang.String sourceTableLocation()

protected org.apache.spark.sql.SparkSession spark()

protected org.apache.spark.api.java.JavaSparkContext sparkContext()

protected StagedSparkTable stageDestTable()

SnapshotTable tableLocation(java.lang.String location)
  Sets the table location for the newly created Iceberg table.

SnapshotTable tableProperties(java.util.Map<java.lang.String,java.lang.String> properties)
  Sets table properties in the newly created Iceberg table.

SnapshotTable tableProperty(java.lang.String property, java.lang.String value)
  Sets a table property in the newly created Iceberg table.

protected org.apache.spark.sql.catalyst.catalog.CatalogTable v1SourceTable()

protected <T> T withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier)
-
Field Detail
-
LOCATION
protected static final java.lang.String LOCATION
- See Also:
- Constant Field Values
-
ICEBERG_METADATA_FOLDER
protected static final java.lang.String ICEBERG_METADATA_FOLDER
- See Also:
- Constant Field Values
-
EXCLUDED_PROPERTIES
protected static final java.util.List<java.lang.String> EXCLUDED_PROPERTIES
-
Constructor Detail
-
BaseSnapshotTableSparkAction
public BaseSnapshotTableSparkAction(org.apache.spark.sql.SparkSession spark, org.apache.spark.sql.connector.catalog.CatalogPlugin sourceCatalog, org.apache.spark.sql.connector.catalog.Identifier sourceTableIdent, org.apache.spark.sql.connector.catalog.CatalogPlugin destCatalog, org.apache.spark.sql.connector.catalog.Identifier destTableIdent)
-
Method Detail
-
self
protected SnapshotTable self()
-
destCatalog
protected org.apache.spark.sql.connector.catalog.StagingTableCatalog destCatalog()
-
destTableIdent
protected org.apache.spark.sql.connector.catalog.Identifier destTableIdent()
-
as
public SnapshotTable as(java.lang.String ident)
Description copied from interface: SnapshotTable
Sets the table identifier for the newly created Iceberg table.
- Specified by:
  as in interface SnapshotTable
- Parameters:
  ident - the destination table identifier
- Returns:
  this for method chaining
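For illustration, a sketch of the identifier forms typically passed here (assuming the string is parsed as a Spark multi-part identifier and resolved against the session's catalogs; names are examples only):

// "spark" is an existing SparkSession; identifiers are illustrative.
SnapshotTable action = SparkActions.get(spark).snapshotTable("db.source_table");

// Destination resolved against the session's current catalog and namespace:
action.as("db.iceberg_snapshot");

// Or with an explicit catalog prefix (assumes multi-part identifier parsing):
// action.as("spark_catalog.db.iceberg_snapshot");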
-
tableProperties
public SnapshotTable tableProperties(java.util.Map<java.lang.String,java.lang.String> properties)
Description copied from interface: SnapshotTable
Sets table properties in the newly created Iceberg table. Any properties with the same key name will be overwritten.
- Specified by:
  tableProperties in interface SnapshotTable
- Parameters:
  properties - a map of properties to be included
- Returns:
  this for method chaining
-
tableProperty
public SnapshotTable tableProperty(java.lang.String property, java.lang.String value)
Description copied from interface: SnapshotTable
Sets a table property in the newly created Iceberg table. Any properties with the same key name will be overwritten.
- Specified by:
  tableProperty in interface SnapshotTable
- Parameters:
  property - the key of the property to add
  value - the value of the property to add
- Returns:
  this for method chaining
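For example, the map-based and single-key setters can be combined in one chain (a sketch; the property keys shown are common Iceberg table properties and the identifiers are illustrative):

// "spark" is an existing SparkSession.
java.util.Map<String, String> props = new java.util.HashMap<>();
props.put("write.format.default", "parquet");
props.put("format-version", "2");

SparkActions.get(spark)
    .snapshotTable("db.source_table")
    .as("db.iceberg_snapshot")
    .tableProperties(props)                                   // set several properties at once
    .tableProperty("comment", "snapshot of db.source_table")  // add or overwrite a single property
    .execute();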
-
execute
public SnapshotTable.Result execute()
Description copied from interface: Action
Executes this action.
- Specified by:
  execute in interface Action<SnapshotTable,SnapshotTable.Result>
- Returns:
  the result of this action
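A sketch of executing a configured action and inspecting the result (assuming SnapshotTable.Result exposes importedDataFilesCount(); identifiers are illustrative):

// "spark" is an existing SparkSession.
SnapshotTable.Result result = SparkActions.get(spark)
    .snapshotTable("db.source_table")
    .as("db.iceberg_snapshot")
    .execute();

// Assumes the Result interface exposes importedDataFilesCount().
System.out.println("Imported data files: " + result.importedDataFilesCount());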
-
destTableProps
protected java.util.Map<java.lang.String,java.lang.String> destTableProps()
-
checkSourceCatalog
protected org.apache.spark.sql.connector.catalog.TableCatalog checkSourceCatalog(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)
-
tableLocation
public SnapshotTable tableLocation(java.lang.String location)
Description copied from interface: SnapshotTable
Sets the table location for the newly created Iceberg table.
- Specified by:
  tableLocation in interface SnapshotTable
- Parameters:
  location - a table location
- Returns:
  this for method chaining
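For example (a sketch; the path is illustrative and, consistent with the class description above, should be separate from the source table's directory so the snapshot remains independent):

// "spark" is an existing SparkSession; identifiers and the path are illustrative.
SparkActions.get(spark)
    .snapshotTable("db.source_table")
    .as("db.iceberg_snapshot")
    .tableLocation("s3://bucket/warehouse/db/iceberg_snapshot")
    .execute();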
-
sourceTableLocation
protected java.lang.String sourceTableLocation()
-
v1SourceTable
protected org.apache.spark.sql.catalyst.catalog.CatalogTable v1SourceTable()
-
sourceCatalog
protected org.apache.spark.sql.connector.catalog.TableCatalog sourceCatalog()
-
sourceTableIdent
protected org.apache.spark.sql.connector.catalog.Identifier sourceTableIdent()
-
setProperties
protected void setProperties(java.util.Map<java.lang.String,java.lang.String> properties)
-
setProperty
protected void setProperty(java.lang.String key, java.lang.String value)
-
additionalProperties
protected java.util.Map<java.lang.String,java.lang.String> additionalProperties()
-
checkDestinationCatalog
protected org.apache.spark.sql.connector.catalog.StagingTableCatalog checkDestinationCatalog(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)
-
stageDestTable
protected StagedSparkTable stageDestTable()
-
ensureNameMappingPresent
protected void ensureNameMappingPresent(Table table)
-
getMetadataLocation
protected java.lang.String getMetadataLocation(Table table)
-
spark
protected org.apache.spark.sql.SparkSession spark()
-
sparkContext
protected org.apache.spark.api.java.JavaSparkContext sparkContext()
-
option
public ThisT option(java.lang.String name, java.lang.String value)
Description copied from interface: Action
Configures this action with an extra option. Certain actions allow users to control internal details of their execution via options.
-
options
public ThisT options(java.util.Map<java.lang.String,java.lang.String> newOptions)
Description copied from interface: Action
Configures this action with extra options. Certain actions allow users to control internal details of their execution via options.
-
options
protected java.util.Map<java.lang.String,java.lang.String> options()
-
withJobGroupInfo
protected <T> T withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier)
-
newJobGroupInfo
protected JobGroupInfo newJobGroupInfo(java.lang.String groupId, java.lang.String desc)
-
newStaticTable
protected Table newStaticTable(TableMetadata metadata, FileIO io)
-
buildValidDataFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidDataFileDF(Table table)
-
buildManifestFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestFileDF(Table table)
-
buildManifestListDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestListDF(Table table)
-
buildOtherMetadataFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildOtherMetadataFileDF(Table table)
-
buildValidMetadataFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidMetadataFileDF(Table table)
-
loadMetadataTable
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table, MetadataTableType type)