public class BaseDeleteReachableFilesSparkAction extends java.lang.Object implements DeleteReachableFiles
DeleteReachableFiles
that uses metadata tables in Spark
to determine which files should be deleted.DeleteReachableFiles.Result
Constructor and Description |
---|
BaseDeleteReachableFilesSparkAction(org.apache.spark.sql.SparkSession spark,
java.lang.String metadataLocation) |
Modifier and Type | Method and Description |
---|---|
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
buildManifestFileDF(Table table) |
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
buildManifestListDF(Table table) |
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
buildOtherMetadataFileDF(Table table) |
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
buildValidDataFileDF(Table table) |
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
buildValidMetadataFileDF(Table table) |
DeleteReachableFiles |
deleteWith(java.util.function.Consumer<java.lang.String> deleteFunc)
Passes an alternative delete implementation that will be used for files.
|
DeleteReachableFiles.Result |
execute()
Executes this action.
|
DeleteReachableFiles |
executeDeleteWith(java.util.concurrent.ExecutorService executorService)
Passes an alternative executor service that will be used for files removal.
|
DeleteReachableFiles |
io(FileIO fileIO)
Set the
FileIO to be used for files removal |
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
loadMetadataTable(Table table,
MetadataTableType type) |
protected JobGroupInfo |
newJobGroupInfo(java.lang.String groupId,
java.lang.String desc) |
protected Table |
newStaticTable(TableMetadata metadata,
FileIO io) |
ThisT |
option(java.lang.String name,
java.lang.String value)
Configures this action with an extra option.
|
protected java.util.Map<java.lang.String,java.lang.String> |
options() |
ThisT |
options(java.util.Map<java.lang.String,java.lang.String> newOptions)
Configures this action with extra options.
|
protected DeleteReachableFiles |
self() |
protected org.apache.spark.sql.SparkSession |
spark() |
protected org.apache.spark.api.java.JavaSparkContext |
sparkContext() |
protected <T> T |
withJobGroupInfo(JobGroupInfo info,
java.util.function.Supplier<T> supplier) |
public BaseDeleteReachableFilesSparkAction(org.apache.spark.sql.SparkSession spark, java.lang.String metadataLocation)
protected DeleteReachableFiles self()
public DeleteReachableFiles io(FileIO fileIO)
DeleteReachableFiles
FileIO
to be used for files removalio
in interface DeleteReachableFiles
fileIO
- FileIO to use for files removalpublic DeleteReachableFiles deleteWith(java.util.function.Consumer<java.lang.String> deleteFunc)
DeleteReachableFiles
deleteWith
in interface DeleteReachableFiles
deleteFunc
- a function that will be called to delete files.
The function accepts path to file as an argument.public DeleteReachableFiles executeDeleteWith(java.util.concurrent.ExecutorService executorService)
DeleteReachableFiles
If this method is not called, files will be deleted in the current thread.
executeDeleteWith
in interface DeleteReachableFiles
executorService
- the service to usepublic DeleteReachableFiles.Result execute()
Action
execute
in interface Action<DeleteReachableFiles,DeleteReachableFiles.Result>
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildOtherMetadataFileDF(Table table)
protected org.apache.spark.sql.SparkSession spark()
protected org.apache.spark.api.java.JavaSparkContext sparkContext()
public ThisT option(java.lang.String name, java.lang.String value)
Action
Certain actions allow users to control internal details of their execution via options.
public ThisT options(java.util.Map<java.lang.String,java.lang.String> newOptions)
Action
Certain actions allow users to control internal details of their execution via options.
protected java.util.Map<java.lang.String,java.lang.String> options()
protected <T> T withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier)
protected JobGroupInfo newJobGroupInfo(java.lang.String groupId, java.lang.String desc)
protected Table newStaticTable(TableMetadata metadata, FileIO io)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidDataFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestListDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidMetadataFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table, MetadataTableType type)