public class DeleteReachableFilesSparkAction extends java.lang.Object implements DeleteReachableFiles

An implementation of DeleteReachableFiles that uses metadata tables in Spark to determine which files should be deleted.

Nested interfaces inherited from DeleteReachableFiles: DeleteReachableFiles.Result

| Modifier and Type | Field and Description | 
|---|---|
| protected static org.apache.iceberg.relocated.com.google.common.base.Joiner | COMMA_JOINER | 
| protected static org.apache.iceberg.relocated.com.google.common.base.Splitter | COMMA_SPLITTER | 
| protected static java.lang.String | FILE_PATH | 
| protected static java.lang.String | LAST_MODIFIED | 
| protected static java.lang.String | MANIFEST | 
| protected static java.lang.String | MANIFEST_LIST | 
| protected static java.lang.String | OTHERS | 
| protected static java.lang.String | STATISTICS_FILES | 
| static java.lang.String | STREAM_RESULTS | 
| static boolean | STREAM_RESULTS_DEFAULT | 

| Modifier and Type | Method and Description | 
|---|---|
| protected org.apache.spark.sql.Dataset<FileInfo> | allReachableOtherMetadataFileDS(Table table) | 
| protected org.apache.spark.sql.Dataset<FileInfo> | contentFileDS(Table table) | 
| protected org.apache.spark.sql.Dataset<FileInfo> | contentFileDS(Table table, java.util.Set<java.lang.Long> snapshotIds) | 
| protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary | deleteFiles(java.util.concurrent.ExecutorService executorService, java.util.function.Consumer<java.lang.String> deleteFunc, java.util.Iterator<FileInfo> files): Deletes files and keeps track of how many files were removed for each file type. | 
| protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary | deleteFiles(SupportsBulkOperations io, java.util.Iterator<FileInfo> files) | 
| DeleteReachableFilesSparkAction | deleteWith(java.util.function.Consumer<java.lang.String> newDeleteFunc): Passes an alternative delete implementation that will be used for files. | 
| DeleteReachableFiles.Result | execute(): Executes this action. | 
| DeleteReachableFilesSparkAction | executeDeleteWith(java.util.concurrent.ExecutorService executorService): Passes an alternative executor service that will be used for files removal. | 
| DeleteReachableFilesSparkAction | io(FileIO fileIO): Set the FileIO to be used for files removal. | 
| protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> | loadMetadataTable(Table table, MetadataTableType type) | 
| protected org.apache.spark.sql.Dataset<FileInfo> | manifestDS(Table table) | 
| protected org.apache.spark.sql.Dataset<FileInfo> | manifestDS(Table table, java.util.Set<java.lang.Long> snapshotIds) | 
| protected org.apache.spark.sql.Dataset<FileInfo> | manifestListDS(Table table) | 
| protected org.apache.spark.sql.Dataset<FileInfo> | manifestListDS(Table table, java.util.Set<java.lang.Long> snapshotIds) | 
| protected JobGroupInfo | newJobGroupInfo(java.lang.String groupId, java.lang.String desc) | 
| protected Table | newStaticTable(TableMetadata metadata, FileIO io) | 
| ThisT | option(java.lang.String name, java.lang.String value) | 
| protected java.util.Map<java.lang.String,java.lang.String> | options() | 
| ThisT | options(java.util.Map<java.lang.String,java.lang.String> newOptions) | 
| protected org.apache.spark.sql.Dataset<FileInfo> | otherMetadataFileDS(Table table) | 
| protected DeleteReachableFilesSparkAction | self() | 
| protected org.apache.spark.sql.SparkSession | spark() | 
| protected org.apache.spark.api.java.JavaSparkContext | sparkContext() | 
| protected org.apache.spark.sql.Dataset<FileInfo> | statisticsFileDS(Table table, java.util.Set<java.lang.Long> snapshotIds) | 
| protected <T> T | withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier) | 

public static final java.lang.String STREAM_RESULTS
public static final boolean STREAM_RESULTS_DEFAULT
protected static final java.lang.String MANIFEST
protected static final java.lang.String MANIFEST_LIST
protected static final java.lang.String STATISTICS_FILES
protected static final java.lang.String OTHERS
protected static final java.lang.String FILE_PATH
protected static final java.lang.String LAST_MODIFIED
protected static final org.apache.iceberg.relocated.com.google.common.base.Splitter COMMA_SPLITTER
protected static final org.apache.iceberg.relocated.com.google.common.base.Joiner COMMA_JOINER
protected DeleteReachableFilesSparkAction self()
public DeleteReachableFilesSparkAction io(FileIO fileIO)
Description copied from interface: DeleteReachableFiles
Set the FileIO to be used for files removal.
Specified by: io in interface DeleteReachableFiles
Parameters:
fileIO - FileIO to use for files removal

public DeleteReachableFilesSparkAction deleteWith(java.util.function.Consumer<java.lang.String> newDeleteFunc)
Description copied from interface: DeleteReachableFiles
Passes an alternative delete implementation that will be used for files.
Specified by: deleteWith in interface DeleteReachableFiles
Parameters:
newDeleteFunc - a function that will be called to delete files. The function accepts the path to the file as an argument.

public DeleteReachableFilesSparkAction executeDeleteWith(java.util.concurrent.ExecutorService executorService)
Description copied from interface: DeleteReachableFiles
Passes an alternative executor service that will be used for files removal. This service is used only if a custom delete function is provided by DeleteReachableFiles.deleteWith(Consumer) or if the FileIO does not support bulk deletes. Otherwise, parallelism should be controlled by the IO specific deleteFiles method.
Specified by: executeDeleteWith in interface DeleteReachableFiles
Parameters:
executorService - the service to use

public DeleteReachableFiles.Result execute()
Description copied from interface: Action
Executes this action.
Specified by: execute in interface Action<DeleteReachableFiles,DeleteReachableFiles.Result>

protected org.apache.spark.sql.SparkSession spark()
protected org.apache.spark.api.java.JavaSparkContext sparkContext()
public ThisT option(java.lang.String name, java.lang.String value)
public ThisT options(java.util.Map<java.lang.String,java.lang.String> newOptions)
protected java.util.Map<java.lang.String,java.lang.String> options()
protected <T> T withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier)
protected JobGroupInfo newJobGroupInfo(java.lang.String groupId, java.lang.String desc)
protected Table newStaticTable(TableMetadata metadata, FileIO io)
protected org.apache.spark.sql.Dataset<FileInfo> contentFileDS(Table table, java.util.Set<java.lang.Long> snapshotIds)
protected org.apache.spark.sql.Dataset<FileInfo> manifestDS(Table table, java.util.Set<java.lang.Long> snapshotIds)
protected org.apache.spark.sql.Dataset<FileInfo> manifestListDS(Table table, java.util.Set<java.lang.Long> snapshotIds)
protected org.apache.spark.sql.Dataset<FileInfo> statisticsFileDS(Table table, java.util.Set<java.lang.Long> snapshotIds)
protected org.apache.spark.sql.Dataset<FileInfo> otherMetadataFileDS(Table table)
protected org.apache.spark.sql.Dataset<FileInfo> allReachableOtherMetadataFileDS(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table, MetadataTableType type)
protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary deleteFiles(java.util.concurrent.ExecutorService executorService, java.util.function.Consumer<java.lang.String> deleteFunc, java.util.Iterator<FileInfo> files)
Deletes files and keeps track of how many files were removed for each file type.
Parameters:
executorService - an executor service to use for parallel deletes
deleteFunc - a delete function
files - an iterator of Spark rows of the structure (path: String, type: String)

protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary deleteFiles(SupportsBulkOperations io, java.util.Iterator<FileInfo> files)
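
The executor-based deleteFiles overload is described as deleting files in parallel while counting removals per file type. The following is a minimal, JDK-only sketch of that pattern, not Iceberg's implementation. The class DeleteFilesSketch, the FileEntry stand-in for FileInfo, the Map-based summary in place of DeleteSummary, and the type labels "Data" and "Manifest" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

class DeleteFilesSketch {

  // Hypothetical stand-in for FileInfo: a (path, type) pair.
  static final class FileEntry {
    final String path;
    final String type;

    FileEntry(String path, String type) {
      this.path = path;
      this.type = type;
    }
  }

  // Runs deleteFunc for every file on the executor and returns the number
  // of successful deletions per file type. Failed deletions are not counted.
  static Map<String, Long> deleteFiles(
      ExecutorService executorService,
      Consumer<String> deleteFunc,
      Iterator<FileEntry> files) {
    Map<String, AtomicLong> counts = new ConcurrentHashMap<>();
    List<Future<?>> pending = new ArrayList<>();

    while (files.hasNext()) {
      FileEntry file = files.next();
      pending.add(executorService.submit(() -> {
        deleteFunc.accept(file.path);
        // Reached only if deleteFunc did not throw.
        counts.computeIfAbsent(file.type, t -> new AtomicLong()).incrementAndGet();
      }));
    }

    // Wait for all submitted deletes before building the summary.
    for (Future<?> task : pending) {
      try {
        task.get();
      } catch (ExecutionException e) {
        // The delete function threw for this file; leave it uncounted.
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while deleting files", e);
      }
    }

    Map<String, Long> summary = new TreeMap<>();
    counts.forEach((type, n) -> summary.put(type, n.get()));
    return summary;
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    Set<String> deleted = ConcurrentHashMap.newKeySet();

    // Illustrative paths only; the delete function just records each path.
    List<FileEntry> files = Arrays.asList(
        new FileEntry("s3://bucket/data/a.parquet", "Data"),
        new FileEntry("s3://bucket/data/b.parquet", "Data"),
        new FileEntry("s3://bucket/metadata/m0.avro", "Manifest"));

    Map<String, Long> summary = deleteFiles(pool, deleted::add, files.iterator());
    pool.shutdown();

    System.out.println(summary); // prints {Data=2, Manifest=1}
  }
}
```

In this sketch the caller owns the executor and shuts it down, mirroring how executeDeleteWith accepts an externally supplied service. Whether the real action skips or reports failed deletions is not stated in this page, so the skip-on-failure behavior here is a design choice of the sketch.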