Class MigrateTableSparkAction

  • All Implemented Interfaces:
    Action<MigrateTable,​MigrateTable.Result>, MigrateTable

    public class MigrateTableSparkAction
    extends java.lang.Object
    implements MigrateTable
    Takes a Spark table in the source catalog and attempts to transform it into an Iceberg table in the same location with the same identifier. Once complete, the identifier, which previously referred to a non-Iceberg table, will refer to the newly migrated Iceberg table.
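    A typical invocation goes through the fluent builder API. The sketch below assumes an active SparkSession with an Iceberg-enabled session catalog; the table identifier, property value, and backup name are illustrative:

    ```java
    import org.apache.iceberg.actions.MigrateTable;
    import org.apache.iceberg.spark.actions.SparkActions;

    public class MigrateExample {
      public static void main(String[] args) {
        // Migrate a non-Iceberg Spark table in place. "db.sample" is an
        // illustrative identifier; a live SparkSession is required.
        MigrateTable.Result result = SparkActions.get()
            .migrateTable("db.sample")
            .tableProperty("format-version", "2")   // property for the new Iceberg table
            .backupTableName("db.sample_backup_")   // custom name for the backup table
            .execute();

        System.out.println("Migrated data files: " + result.migratedDataFilesCount());
      }
    }
    ```

    After `execute()` returns, `db.sample` resolves to the new Iceberg table, and the original table is preserved under the backup name unless `dropBackup()` was requested.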
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected static org.apache.iceberg.relocated.com.google.common.base.Joiner COMMA_JOINER  
      protected static org.apache.iceberg.relocated.com.google.common.base.Splitter COMMA_SPLITTER  
      protected static java.util.List<java.lang.String> EXCLUDED_PROPERTIES  
      protected static java.lang.String FILE_PATH  
      protected static java.lang.String ICEBERG_METADATA_FOLDER  
      protected static java.lang.String LAST_MODIFIED  
      protected static java.lang.String LOCATION  
      protected static java.lang.String MANIFEST  
      protected static java.lang.String MANIFEST_LIST  
      protected static java.lang.String OTHERS  
      protected static java.lang.String STATISTICS_FILES  
    • Method Summary

      Modifier and Type Method Description
      protected java.util.Map<java.lang.String,​java.lang.String> additionalProperties()  
      protected org.apache.spark.sql.Dataset<FileInfo> allReachableOtherMetadataFileDS​(Table table)  
      MigrateTableSparkAction backupTableName​(java.lang.String tableName)
      Sets a table name for the backup of the original table.
      protected org.apache.spark.sql.connector.catalog.StagingTableCatalog checkDestinationCatalog​(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)  
      protected org.apache.spark.sql.connector.catalog.TableCatalog checkSourceCatalog​(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)  
      protected org.apache.spark.sql.Dataset<FileInfo> contentFileDS​(Table table)  
      protected org.apache.spark.sql.Dataset<FileInfo> contentFileDS​(Table table, java.util.Set<java.lang.Long> snapshotIds)  
      protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary deleteFiles​(java.util.concurrent.ExecutorService executorService, java.util.function.Consumer<java.lang.String> deleteFunc, java.util.Iterator<FileInfo> files)
      Deletes files and keeps track of how many files were removed for each file type.
      protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary deleteFiles​(SupportsBulkOperations io, java.util.Iterator<FileInfo> files)  
      protected org.apache.spark.sql.connector.catalog.StagingTableCatalog destCatalog()  
      protected org.apache.spark.sql.connector.catalog.Identifier destTableIdent()  
      protected java.util.Map<java.lang.String,​java.lang.String> destTableProps()  
      MigrateTableSparkAction dropBackup()
      Drops the backup of the original table after a successful migration.
      protected void ensureNameMappingPresent​(Table table)  
      MigrateTable.Result execute()
      Executes this action.
      MigrateTableSparkAction executeWith​(java.util.concurrent.ExecutorService service)
      Sets the executor service to use for parallel file reading.
      protected java.lang.String getMetadataLocation​(Table table)  
      protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable​(Table table, MetadataTableType type)  
      protected org.apache.spark.sql.Dataset<FileInfo> manifestDS​(Table table)  
      protected org.apache.spark.sql.Dataset<FileInfo> manifestDS​(Table table, java.util.Set<java.lang.Long> snapshotIds)  
      protected org.apache.spark.sql.Dataset<FileInfo> manifestListDS​(Table table)  
      protected org.apache.spark.sql.Dataset<FileInfo> manifestListDS​(Table table, java.util.Set<java.lang.Long> snapshotIds)  
      protected JobGroupInfo newJobGroupInfo​(java.lang.String groupId, java.lang.String desc)  
      protected Table newStaticTable​(TableMetadata metadata, FileIO io)  
      ThisT option​(java.lang.String name, java.lang.String value)  
      protected java.util.Map<java.lang.String,​java.lang.String> options()  
      ThisT options​(java.util.Map<java.lang.String,​java.lang.String> newOptions)  
      protected org.apache.spark.sql.Dataset<FileInfo> otherMetadataFileDS​(Table table)  
      protected MigrateTableSparkAction self()  
      protected void setProperties​(java.util.Map<java.lang.String,​java.lang.String> properties)  
      protected void setProperty​(java.lang.String key, java.lang.String value)  
      protected org.apache.spark.sql.connector.catalog.TableCatalog sourceCatalog()  
      protected org.apache.spark.sql.connector.catalog.Identifier sourceTableIdent()  
      protected java.lang.String sourceTableLocation()  
      protected org.apache.spark.sql.SparkSession spark()  
      protected org.apache.spark.api.java.JavaSparkContext sparkContext()  
      protected StagedSparkTable stageDestTable()  
      protected org.apache.spark.sql.Dataset<FileInfo> statisticsFileDS​(Table table, java.util.Set<java.lang.Long> snapshotIds)  
      MigrateTableSparkAction tableProperties​(java.util.Map<java.lang.String,​java.lang.String> properties)
      Sets table properties in the newly created Iceberg table.
      MigrateTableSparkAction tableProperty​(java.lang.String property, java.lang.String value)
      Sets a table property in the newly created Iceberg table.
      protected org.apache.spark.sql.catalyst.catalog.CatalogTable v1SourceTable()  
      protected <T> T withJobGroupInfo​(JobGroupInfo info, java.util.function.Supplier<T> supplier)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • ICEBERG_METADATA_FOLDER

        protected static final java.lang.String ICEBERG_METADATA_FOLDER
        See Also:
        Constant Field Values
      • EXCLUDED_PROPERTIES

        protected static final java.util.List<java.lang.String> EXCLUDED_PROPERTIES
      • STATISTICS_FILES

        protected static final java.lang.String STATISTICS_FILES
        See Also:
        Constant Field Values
      • COMMA_SPLITTER

        protected static final org.apache.iceberg.relocated.com.google.common.base.Splitter COMMA_SPLITTER
      • COMMA_JOINER

        protected static final org.apache.iceberg.relocated.com.google.common.base.Joiner COMMA_JOINER
    • Method Detail

      • destCatalog

        protected org.apache.spark.sql.connector.catalog.StagingTableCatalog destCatalog()
      • destTableIdent

        protected org.apache.spark.sql.connector.catalog.Identifier destTableIdent()
      • tableProperties

        public MigrateTableSparkAction tableProperties​(java.util.Map<java.lang.String,​java.lang.String> properties)
        Description copied from interface: MigrateTable
        Sets table properties in the newly created Iceberg table. Any properties with the same key name will be overwritten.
        Specified by:
        tableProperties in interface MigrateTable
        Parameters:
        properties - a map of properties to set
        Returns:
        this for method chaining
      • tableProperty

        public MigrateTableSparkAction tableProperty​(java.lang.String property,
                                                     java.lang.String value)
        Description copied from interface: MigrateTable
        Sets a table property in the newly created Iceberg table. Any properties with the same key will be overwritten.
        Specified by:
        tableProperty in interface MigrateTable
        Parameters:
        property - a table property name
        value - a table property value
        Returns:
        this for method chaining
      • backupTableName

        public MigrateTableSparkAction backupTableName​(java.lang.String tableName)
        Description copied from interface: MigrateTable
        Sets a table name for the backup of the original table.
        Specified by:
        backupTableName in interface MigrateTable
        Parameters:
        tableName - the table name for backup
        Returns:
        this for method chaining
      • executeWith

        public MigrateTableSparkAction executeWith​(java.util.concurrent.ExecutorService service)
        Description copied from interface: MigrateTable
        Sets the executor service to use for parallel file reading. By default, no executor service is used.
        Specified by:
        executeWith in interface MigrateTable
        Parameters:
        service - executor service
        Returns:
        this for method chaining
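        For example, a fixed-size thread pool can be supplied to parallelize file reading during migration. This is a sketch under the same assumptions as above (live SparkSession, illustrative table name); the pool size of 8 is arbitrary:

        ```java
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;

        import org.apache.iceberg.actions.MigrateTable;
        import org.apache.iceberg.spark.actions.SparkActions;

        public class ParallelMigrateExample {
          public static void main(String[] args) {
            // Thread pool used by the action for parallel file reading.
            ExecutorService pool = Executors.newFixedThreadPool(8);
            try {
              MigrateTable.Result result = SparkActions.get()
                  .migrateTable("db.sample")   // illustrative identifier
                  .executeWith(pool)
                  .execute();
              System.out.println("Migrated data files: " + result.migratedDataFilesCount());
            } finally {
              // The caller owns the executor service and must shut it down.
              pool.shutdown();
            }
          }
        }
        ```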
      • destTableProps

        protected java.util.Map<java.lang.String,​java.lang.String> destTableProps()
      • checkSourceCatalog

        protected org.apache.spark.sql.connector.catalog.TableCatalog checkSourceCatalog​(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)
      • sourceTableLocation

        protected java.lang.String sourceTableLocation()
      • v1SourceTable

        protected org.apache.spark.sql.catalyst.catalog.CatalogTable v1SourceTable()
      • sourceCatalog

        protected org.apache.spark.sql.connector.catalog.TableCatalog sourceCatalog()
      • sourceTableIdent

        protected org.apache.spark.sql.connector.catalog.Identifier sourceTableIdent()
      • setProperties

        protected void setProperties​(java.util.Map<java.lang.String,​java.lang.String> properties)
      • setProperty

        protected void setProperty​(java.lang.String key,
                                   java.lang.String value)
      • additionalProperties

        protected java.util.Map<java.lang.String,​java.lang.String> additionalProperties()
      • checkDestinationCatalog

        protected org.apache.spark.sql.connector.catalog.StagingTableCatalog checkDestinationCatalog​(org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)
      • ensureNameMappingPresent

        protected void ensureNameMappingPresent​(Table table)
      • getMetadataLocation

        protected java.lang.String getMetadataLocation​(Table table)
      • spark

        protected org.apache.spark.sql.SparkSession spark()
      • sparkContext

        protected org.apache.spark.api.java.JavaSparkContext sparkContext()
      • option

        public ThisT option​(java.lang.String name,
                            java.lang.String value)
      • options

        public ThisT options​(java.util.Map<java.lang.String,​java.lang.String> newOptions)
      • options

        protected java.util.Map<java.lang.String,​java.lang.String> options()
      • withJobGroupInfo

        protected <T> T withJobGroupInfo​(JobGroupInfo info,
                                         java.util.function.Supplier<T> supplier)
      • newJobGroupInfo

        protected JobGroupInfo newJobGroupInfo​(java.lang.String groupId,
                                               java.lang.String desc)
      • contentFileDS

        protected org.apache.spark.sql.Dataset<FileInfo> contentFileDS​(Table table)
      • contentFileDS

        protected org.apache.spark.sql.Dataset<FileInfo> contentFileDS​(Table table,
                                                                       java.util.Set<java.lang.Long> snapshotIds)
      • manifestDS

        protected org.apache.spark.sql.Dataset<FileInfo> manifestDS​(Table table)
      • manifestDS

        protected org.apache.spark.sql.Dataset<FileInfo> manifestDS​(Table table,
                                                                    java.util.Set<java.lang.Long> snapshotIds)
      • manifestListDS

        protected org.apache.spark.sql.Dataset<FileInfo> manifestListDS​(Table table)
      • manifestListDS

        protected org.apache.spark.sql.Dataset<FileInfo> manifestListDS​(Table table,
                                                                        java.util.Set<java.lang.Long> snapshotIds)
      • statisticsFileDS

        protected org.apache.spark.sql.Dataset<FileInfo> statisticsFileDS​(Table table,
                                                                          java.util.Set<java.lang.Long> snapshotIds)
      • otherMetadataFileDS

        protected org.apache.spark.sql.Dataset<FileInfo> otherMetadataFileDS​(Table table)
      • allReachableOtherMetadataFileDS

        protected org.apache.spark.sql.Dataset<FileInfo> allReachableOtherMetadataFileDS​(Table table)
      • loadMetadataTable

        protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable​(Table table,
                                                                                           MetadataTableType type)
      • deleteFiles

        protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary deleteFiles​(java.util.concurrent.ExecutorService executorService,
                                                                                             java.util.function.Consumer<java.lang.String> deleteFunc,
                                                                                             java.util.Iterator<FileInfo> files)
        Deletes files and keeps track of how many files were removed for each file type.
        Parameters:
        executorService - an executor service to use for parallel deletes
        deleteFunc - a function that deletes a file, given its path
        files - an iterator of Spark rows of the structure (path: String, type: String)
        Returns:
        stats on which files were deleted
      • deleteFiles

        protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary deleteFiles​(SupportsBulkOperations io,
                                                                                             java.util.Iterator<FileInfo> files)