public class BaseDeleteReachableFilesSparkAction extends java.lang.Object implements DeleteReachableFiles
DeleteReachableFiles that uses metadata tables in Spark
to determine which files should be deleted.DeleteReachableFiles.Result| Constructor and Description |
|---|
BaseDeleteReachableFilesSparkAction(org.apache.spark.sql.SparkSession spark,
java.lang.String metadataLocation) |
| Modifier and Type | Method and Description |
|---|---|
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
buildManifestFileDF(Table table) |
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
buildManifestListDF(Table table) |
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
buildOtherMetadataFileDF(Table table) |
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
buildValidDataFileDF(Table table) |
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
buildValidMetadataFileDF(Table table) |
DeleteReachableFiles |
deleteWith(java.util.function.Consumer<java.lang.String> deleteFunc)
Passes an alternative delete implementation that will be used for files.
|
DeleteReachableFiles.Result |
execute()
Executes this action.
|
DeleteReachableFiles |
executeDeleteWith(java.util.concurrent.ExecutorService executorService)
Passes an alternative executor service that will be used for files removal.
|
DeleteReachableFiles |
io(FileIO fileIO)
Set the
FileIO to be used for files removal |
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
loadMetadataTable(Table table,
MetadataTableType type) |
protected JobGroupInfo |
newJobGroupInfo(java.lang.String groupId,
java.lang.String desc) |
protected Table |
newStaticTable(TableMetadata metadata,
FileIO io) |
ThisT |
option(java.lang.String name,
java.lang.String value)
Configures this action with an extra option.
|
protected java.util.Map<java.lang.String,java.lang.String> |
options() |
ThisT |
options(java.util.Map<java.lang.String,java.lang.String> newOptions)
Configures this action with extra options.
|
protected DeleteReachableFiles |
self() |
protected org.apache.spark.sql.SparkSession |
spark() |
protected org.apache.spark.api.java.JavaSparkContext |
sparkContext() |
protected <T> T |
withJobGroupInfo(JobGroupInfo info,
java.util.function.Supplier<T> supplier) |
public BaseDeleteReachableFilesSparkAction(org.apache.spark.sql.SparkSession spark,
java.lang.String metadataLocation)
protected DeleteReachableFiles self()
public DeleteReachableFiles io(FileIO fileIO)
DeleteReachableFilesFileIO to be used for files removalio in interface DeleteReachableFilesfileIO - FileIO to use for files removalpublic DeleteReachableFiles deleteWith(java.util.function.Consumer<java.lang.String> deleteFunc)
DeleteReachableFilesdeleteWith in interface DeleteReachableFilesdeleteFunc - a function that will be called to delete files.
The function accepts path to file as an argument.public DeleteReachableFiles executeDeleteWith(java.util.concurrent.ExecutorService executorService)
DeleteReachableFilesIf this method is not called, files will be deleted in the current thread.
executeDeleteWith in interface DeleteReachableFilesexecutorService - the service to usepublic DeleteReachableFiles.Result execute()
Actionexecute in interface Action<DeleteReachableFiles,DeleteReachableFiles.Result>protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildOtherMetadataFileDF(Table table)
protected org.apache.spark.sql.SparkSession spark()
protected org.apache.spark.api.java.JavaSparkContext sparkContext()
public ThisT option(java.lang.String name,
java.lang.String value)
ActionCertain actions allow users to control internal details of their execution via options.
public ThisT options(java.util.Map<java.lang.String,java.lang.String> newOptions)
ActionCertain actions allow users to control internal details of their execution via options.
protected java.util.Map<java.lang.String,java.lang.String> options()
protected <T> T withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier)
protected JobGroupInfo newJobGroupInfo(java.lang.String groupId, java.lang.String desc)
protected Table newStaticTable(TableMetadata metadata, FileIO io)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidDataFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestListDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidMetadataFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table, MetadataTableType type)