Package org.apache.iceberg.spark.actions
Class DeleteReachableFilesSparkAction
- java.lang.Object
  - org.apache.iceberg.spark.actions.DeleteReachableFilesSparkAction
- All Implemented Interfaces:
Action<DeleteReachableFiles,DeleteReachableFiles.Result>, DeleteReachableFiles
public class DeleteReachableFilesSparkAction extends java.lang.Object implements DeleteReachableFiles
An implementation of DeleteReachableFiles that uses metadata tables in Spark to determine which files should be deleted.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.iceberg.actions.DeleteReachableFiles
DeleteReachableFiles.Result
-
-
Field Summary
Fields (modifier and type, followed by field name):
protected static java.lang.String CONTENT_FILE
protected static java.lang.String FILE_PATH
protected static java.lang.String FILE_TYPE
protected static java.lang.String LAST_MODIFIED
protected static java.lang.String MANIFEST
protected static java.lang.String MANIFEST_LIST
protected static java.lang.String OTHERS
static java.lang.String STREAM_RESULTS
static boolean STREAM_RESULTS_DEFAULT
-
Method Summary
Methods (modifier and type, followed by method; the description, where one exists, is on the indented line below):
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildAllReachableOtherMetadataFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestListDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildOtherMetadataFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidContentFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidContentFileWithTypeDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidMetadataFileDF(Table table)
DeleteReachableFilesSparkAction deleteWith(java.util.function.Consumer<java.lang.String> newDeleteFunc)
    Passes an alternative delete implementation that will be used for files.
DeleteReachableFiles.Result execute()
    Executes this action.
DeleteReachableFilesSparkAction executeDeleteWith(java.util.concurrent.ExecutorService executorService)
    Passes an alternative executor service that will be used for files removal.
DeleteReachableFilesSparkAction io(FileIO fileIO)
    Set the FileIO to be used for files removal.
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table, MetadataTableType type)
protected JobGroupInfo newJobGroupInfo(java.lang.String groupId, java.lang.String desc)
protected Table newStaticTable(TableMetadata metadata, FileIO io)
ThisT option(java.lang.String name, java.lang.String value)
protected java.util.Map<java.lang.String,java.lang.String> options()
ThisT options(java.util.Map<java.lang.String,java.lang.String> newOptions)
protected DeleteReachableFilesSparkAction self()
protected org.apache.spark.sql.SparkSession spark()
protected org.apache.spark.api.java.JavaSparkContext sparkContext()
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> withFileType(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> ds, java.lang.String type)
protected <T> T withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier)
-
-
-
Field Detail
-
STREAM_RESULTS
public static final java.lang.String STREAM_RESULTS
- See Also:
- Constant Field Values
-
STREAM_RESULTS_DEFAULT
public static final boolean STREAM_RESULTS_DEFAULT
- See Also:
- Constant Field Values
-
CONTENT_FILE
protected static final java.lang.String CONTENT_FILE
- See Also:
- Constant Field Values
-
MANIFEST
protected static final java.lang.String MANIFEST
- See Also:
- Constant Field Values
-
MANIFEST_LIST
protected static final java.lang.String MANIFEST_LIST
- See Also:
- Constant Field Values
-
OTHERS
protected static final java.lang.String OTHERS
- See Also:
- Constant Field Values
-
FILE_PATH
protected static final java.lang.String FILE_PATH
- See Also:
- Constant Field Values
-
FILE_TYPE
protected static final java.lang.String FILE_TYPE
- See Also:
- Constant Field Values
-
LAST_MODIFIED
protected static final java.lang.String LAST_MODIFIED
- See Also:
- Constant Field Values
-
-
Method Detail
-
self
protected DeleteReachableFilesSparkAction self()
-
io
public DeleteReachableFilesSparkAction io(FileIO fileIO)
Description copied from interface: DeleteReachableFiles
Set the FileIO to be used for files removal.
- Specified by:
io in interface DeleteReachableFiles
- Parameters:
fileIO - FileIO to use for files removal
- Returns:
this for method chaining
-
deleteWith
public DeleteReachableFilesSparkAction deleteWith(java.util.function.Consumer<java.lang.String> newDeleteFunc)
Description copied from interface: DeleteReachableFiles
Passes an alternative delete implementation that will be used for files.
- Specified by:
deleteWith in interface DeleteReachableFiles
- Parameters:
newDeleteFunc - a function that will be called to delete files. The function accepts the path to a file as its argument.
- Returns:
this for method chaining
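As an illustration of the newDeleteFunc parameter described above, the following is a minimal sketch of a "dry run" delete function: a Consumer<String> that records each path it receives instead of removing the file. DryRunDelete and its methods are hypothetical names introduced here, not part of Iceberg, and the paths are made up. The call to deleteWith appears only in a comment so that the example compiles with the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical helper, not part of Iceberg: records paths instead of deleting them.
class DryRunDelete {

  // Thread-safe list, on the assumption that the function may be called from
  // several threads when an executor service is supplied via executeDeleteWith.
  private final List<String> recorded = new CopyOnWriteArrayList<>();

  // The function to hand to deleteWith(...); it receives the path of each file.
  Consumer<String> deleteFunc() {
    return recorded::add;
  }

  // Snapshot of the paths seen so far, in the order they were received.
  List<String> recordedPaths() {
    return new ArrayList<>(recorded);
  }

  public static void main(String[] args) {
    DryRunDelete dryRun = new DryRunDelete();
    Consumer<String> func = dryRun.deleteFunc();

    // With Iceberg on the classpath, the function would be passed as:
    //   action.deleteWith(func).execute();
    // Here the calls are simulated with made-up paths.
    func.accept("s3://bucket/table/data/file-a.parquet");
    func.accept("s3://bucket/table/metadata/snap-1.avro");

    dryRun.recordedPaths().forEach(path -> System.out.println("would delete: " + path));
  }
}
```

A function of this kind can be useful for inspecting what an action would remove before letting it delete anything.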
-
executeDeleteWith
public DeleteReachableFilesSparkAction executeDeleteWith(java.util.concurrent.ExecutorService executorService)
Description copied from interface: DeleteReachableFiles
Passes an alternative executor service that will be used for files removal.
If this method is not called, files will be deleted in the current thread.
- Specified by:
executeDeleteWith in interface DeleteReachableFiles
- Parameters:
executorService - the service to use
- Returns:
this for method chaining
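To make the executorService parameter above concrete, here is a minimal sketch of building a bounded thread pool that could be passed to executeDeleteWith. DeletePools, newDeletePool, the thread-name prefix, and the pool size are all hypothetical choices made for this example. The sketch also assumes that the caller remains responsible for shutting the pool down, which this page does not state. The call on the action appears only in a comment so that the example compiles with the JDK alone.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Iceberg.
class DeletePools {

  // Creates a fixed-size pool of named daemon threads for file removal.
  static ExecutorService newDeletePool(int threads) {
    AtomicInteger counter = new AtomicInteger();
    ThreadFactory factory = runnable -> {
      Thread thread = new Thread(runnable, "delete-reachable-files-" + counter.getAndIncrement());
      thread.setDaemon(true);
      return thread;
    };
    return Executors.newFixedThreadPool(threads, factory);
  }

  public static void main(String[] args) throws InterruptedException {
    ExecutorService pool = newDeletePool(4);
    try {
      // With Iceberg on the classpath, the pool would be passed as:
      //   action.executeDeleteWith(pool).execute();
      // Here the deletes are simulated with placeholder tasks.
      for (int i = 0; i < 3; i++) {
        int id = i;
        pool.submit(() ->
            System.out.println(Thread.currentThread().getName() + " handles file " + id));
      }
    } finally {
      // Assumption: the caller owns the pool and shuts it down after the action finishes.
      pool.shutdown();
      pool.awaitTermination(30, TimeUnit.SECONDS);
    }
  }
}
```

Naming the threads makes the removal work easier to spot in thread dumps and logs.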
-
execute
public DeleteReachableFiles.Result execute()
Description copied from interface: Action
Executes this action.
- Specified by:
execute in interface Action<DeleteReachableFiles,DeleteReachableFiles.Result>
- Returns:
the result of this action
-
spark
protected org.apache.spark.sql.SparkSession spark()
-
sparkContext
protected org.apache.spark.api.java.JavaSparkContext sparkContext()
-
option
public ThisT option(java.lang.String name, java.lang.String value)
-
options
public ThisT options(java.util.Map<java.lang.String,java.lang.String> newOptions)
-
options
protected java.util.Map<java.lang.String,java.lang.String> options()
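The ThisT return type of option and options, together with the protected self() method, suggests a self-typed fluent pattern in which inherited setters return the concrete subclass. The sketch below illustrates that pattern under stated assumptions: FluentAction and DemoAction are hypothetical classes, not Iceberg's source, and merging new options into the existing map with putAll is an assumed behavior that this page does not document.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical base class illustrating the pattern; not Iceberg's actual source.
abstract class FluentAction<ThisT> {
  private final Map<String, String> options = new HashMap<>();

  // Each concrete subclass returns itself, typed as the subclass.
  protected abstract ThisT self();

  public ThisT option(String name, String value) {
    options.put(name, value);
    return self();
  }

  // Assumption: new options are merged into the existing ones.
  public ThisT options(Map<String, String> newOptions) {
    options.putAll(newOptions);
    return self();
  }

  protected Map<String, String> options() {
    return options;
  }
}

// Hypothetical subclass with one setter of its own.
class DemoAction extends FluentAction<DemoAction> {
  private String target = "";

  @Override
  protected DemoAction self() {
    return this;
  }

  DemoAction target(String newTarget) {
    this.target = newTarget;
    return this;
  }

  // Target followed by the options in key order.
  String describe() {
    return target + " " + new TreeMap<>(options());
  }

  public static void main(String[] args) {
    // option(...) is inherited, yet the chain can continue with target(...),
    // because option(...) returns DemoAction rather than the base type.
    DemoAction action = new DemoAction()
        .option("example-option", "true")
        .target("example-table");
    System.out.println(action.describe());
  }
}
```

Without the type parameter, option would return the base type and the subsequent call to target would not compile without a cast.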
-
withJobGroupInfo
protected <T> T withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier)
-
newJobGroupInfo
protected JobGroupInfo newJobGroupInfo(java.lang.String groupId, java.lang.String desc)
-
newStaticTable
protected Table newStaticTable(TableMetadata metadata, FileIO io)
-
buildValidContentFileWithTypeDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidContentFileWithTypeDF(Table table)
-
buildValidContentFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidContentFileDF(Table table)
-
buildManifestFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestFileDF(Table table)
-
buildManifestListDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestListDF(Table table)
-
buildOtherMetadataFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildOtherMetadataFileDF(Table table)
-
buildAllReachableOtherMetadataFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildAllReachableOtherMetadataFileDF(Table table)
-
buildValidMetadataFileDF
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidMetadataFileDF(Table table)
-
withFileType
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> withFileType(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> ds, java.lang.String type)
-
loadMetadataTable
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table, MetadataTableType type)
-
-