Package org.apache.iceberg.spark.actions
Class BaseSnapshotTableSparkAction
- java.lang.Object
-
- org.apache.iceberg.spark.actions.BaseSnapshotTableSparkAction
-
- All Implemented Interfaces:
Action<SnapshotTable,SnapshotTable.Result>,SnapshotTable
public class BaseSnapshotTableSparkAction extends java.lang.Object implements SnapshotTable
Creates a new Iceberg table based on a source Spark table. The new Iceberg table will have a different data and metadata directory allowing it to exist independently of the source table.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.iceberg.actions.SnapshotTable
SnapshotTable.Result
-
-
Field Summary
Fields Modifier and Type Field Description protected static java.util.List<java.lang.String>EXCLUDED_PROPERTIESprotected static java.lang.StringICEBERG_METADATA_FOLDERprotected static java.lang.StringLOCATION
-
Constructor Summary
Constructors Constructor Description BaseSnapshotTableSparkAction(org.apache.spark.sql.SparkSession spark, org.apache.spark.sql.connector.catalog.CatalogPlugin sourceCatalog, org.apache.spark.sql.connector.catalog.Identifier sourceTableIdent, org.apache.spark.sql.connector.catalog.CatalogPlugin destCatalog, org.apache.spark.sql.connector.catalog.Identifier destTableIdent)
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description protected java.util.Map<java.lang.String,java.lang.String>additionalProperties()SnapshotTableas(java.lang.String ident)Sets the table identifier for the newly created Iceberg table.protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>buildManifestFileDF(Table table)protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>buildManifestListDF(Table table)protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>buildOtherMetadataFileDF(Table table)protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>buildValidDataFileDF(Table table)protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>buildValidMetadataFileDF(Table table)protected org.apache.spark.sql.connector.catalog.StagingTableCatalogcheckDestinationCatalog(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)protected org.apache.spark.sql.connector.catalog.TableCatalogcheckSourceCatalog(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)protected org.apache.spark.sql.connector.catalog.StagingTableCatalogdestCatalog()protected org.apache.spark.sql.connector.catalog.IdentifierdestTableIdent()protected java.util.Map<java.lang.String,java.lang.String>destTableProps()protected voidensureNameMappingPresent(Table table)SnapshotTable.Resultexecute()Executes this action.protected java.lang.StringgetMetadataLocation(Table table)protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>loadMetadataTable(Table table, MetadataTableType type)protected JobGroupInfonewJobGroupInfo(java.lang.String groupId, java.lang.String desc)protected TablenewStaticTable(TableMetadata metadata, FileIO io)ThisToption(java.lang.String name, java.lang.String value)Configures this action with an extra option.protected java.util.Map<java.lang.String,java.lang.String>options()ThisToptions(java.util.Map<java.lang.String,java.lang.String> newOptions)Configures this action with extra options.protected SnapshotTableself()protected voidsetProperties(java.util.Map<java.lang.String,java.lang.String> properties)protected voidsetProperty(java.lang.String key, java.lang.String value)protected org.apache.spark.sql.connector.catalog.TableCatalogsourceCatalog()protected org.apache.spark.sql.connector.catalog.IdentifiersourceTableIdent()protected java.lang.StringsourceTableLocation()protected org.apache.spark.sql.SparkSessionspark()protected org.apache.spark.api.java.JavaSparkContextsparkContext()protected StagedSparkTablestageDestTable()SnapshotTabletableLocation(java.lang.String location)Sets the table location for the newly created Iceberg table.SnapshotTabletableProperties(java.util.Map<java.lang.String,java.lang.String> properties)Sets table properties in the newly created Iceberg table.SnapshotTabletableProperty(java.lang.String property, java.lang.String value)Sets a table property in the newly created Iceberg table.protected org.apache.spark.sql.catalyst.catalog.CatalogTablev1SourceTable()protected <T> TwithJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier)
-
-
-
Field Detail
-
LOCATION
protected static final java.lang.String LOCATION
- See Also:
- Constant Field Values
-
ICEBERG_METADATA_FOLDER
protected static final java.lang.String ICEBERG_METADATA_FOLDER
- See Also:
- Constant Field Values
-
EXCLUDED_PROPERTIES
protected static final java.util.List<java.lang.String> EXCLUDED_PROPERTIES
-
-
Constructor Detail
-
BaseSnapshotTableSparkAction
public BaseSnapshotTableSparkAction(org.apache.spark.sql.SparkSession spark, org.apache.spark.sql.connector.catalog.CatalogPlugin sourceCatalog, org.apache.spark.sql.connector.catalog.Identifier sourceTableIdent, org.apache.spark.sql.connector.catalog.CatalogPlugin destCatalog, org.apache.spark.sql.connector.catalog.Identifier destTableIdent)
-
-
Method Detail
-
self
protected SnapshotTable self()
-
destCatalog
protected org.apache.spark.sql.connector.catalog.StagingTableCatalog destCatalog()
-
destTableIdent
protected org.apache.spark.sql.connector.catalog.Identifier destTableIdent()
-
as
public SnapshotTable as(java.lang.String ident)
Description copied from interface:SnapshotTableSets the table identifier for the newly created Iceberg table.- Specified by:
asin interfaceSnapshotTable- Parameters:
ident- the destination table identifier- Returns:
- this for method chaining
-
tableProperties
public SnapshotTable tableProperties(java.util.Map<java.lang.String,java.lang.String> properties)
Description copied from interface:SnapshotTableSets table properties in the newly created Iceberg table. Any properties with the same key name will be overwritten.- Specified by:
tablePropertiesin interfaceSnapshotTable- Parameters:
properties- a map of properties to be included- Returns:
- this for method chaining
-
tableProperty
public SnapshotTable tableProperty(java.lang.String property, java.lang.String value)
Description copied from interface:SnapshotTableSets a table property in the newly created Iceberg table. Any properties with the same key name will be overwritten.- Specified by:
tablePropertyin interfaceSnapshotTable- Parameters:
property- the key of the property to addvalue- the value of the property to add- Returns:
- this for method chaining
-
execute
public SnapshotTable.Result execute()
Description copied from interface:ActionExecutes this action.- Specified by:
executein interfaceAction<SnapshotTable,SnapshotTable.Result>- Returns:
- the result of this action
-
destTableProps
protected java.util.Map<java.lang.String,java.lang.String> destTableProps()
-
checkSourceCatalog
protected org.apache.spark.sql.connector.catalog.TableCatalog checkSourceCatalog(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)
-
tableLocation
public SnapshotTable tableLocation(java.lang.String location)
Description copied from interface:SnapshotTableSets the table location for the newly created Iceberg table.- Specified by:
tableLocationin interfaceSnapshotTable- Parameters:
location- a table location- Returns:
- this for method chaining
-
sourceTableLocation
protected java.lang.String sourceTableLocation()
-
v1SourceTable
protected org.apache.spark.sql.catalyst.catalog.CatalogTable v1SourceTable()
-
sourceCatalog
protected org.apache.spark.sql.connector.catalog.TableCatalog sourceCatalog()
-
sourceTableIdent
protected org.apache.spark.sql.connector.catalog.Identifier sourceTableIdent()
-
setProperties
protected void setProperties(java.util.Map<java.lang.String,java.lang.String> properties)
-
setProperty
protected void setProperty(java.lang.String key, java.lang.String value)
-
additionalProperties
protected java.util.Map<java.lang.String,java.lang.String> additionalProperties()
-
checkDestinationCatalog
protected org.apache.spark.sql.connector.catalog.StagingTableCatalog checkDestinationCatalog(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)
-
stageDestTable
protected StagedSparkTable stageDestTable()
-
ensureNameMappingPresent
protected void ensureNameMappingPresent(Table table)
-
getMetadataLocation
protected java.lang.String getMetadataLocation(Table table)
-
spark
protected org.apache.spark.sql.SparkSession spark()
-
sparkContext
protected org.apache.spark.api.java.JavaSparkContext sparkContext()
-
option
public ThisT option(java.lang.String name, java.lang.String value)Description copied from interface:ActionConfigures this action with an extra option.Certain actions allow users to control internal details of their execution via options.
-
options
public ThisT options(java.util.Map<java.lang.String,java.lang.String> newOptions)
Description copied from interface:ActionConfigures this action with extra options.Certain actions allow users to control internal details of their execution via options.
-
options
protected java.util.Map<java.lang.String,java.lang.String> options()
-
withJobGroupInfo
protected <T> T withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier)
-
newJobGroupInfo
protected JobGroupInfo newJobGroupInfo(java.lang.String groupId, java.lang.String desc)
-
newStaticTable
protected Table newStaticTable(TableMetadata metadata, FileIO io)
-
buildValidDataFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidDataFileDF(Table table)
-
buildManifestFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestFileDF(Table table)
-
buildManifestListDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestListDF(Table table)
-
buildOtherMetadataFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildOtherMetadataFileDF(Table table)
-
buildValidMetadataFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidMetadataFileDF(Table table)
-
loadMetadataTable
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table, MetadataTableType type)
-
-