Interface DeleteOrphanFiles
-
- All Superinterfaces:
Action<DeleteOrphanFiles,DeleteOrphanFiles.Result>
- All Known Implementing Classes:
DeleteOrphanFilesSparkAction
public interface DeleteOrphanFiles extends Action<DeleteOrphanFiles,DeleteOrphanFiles.Result>
An action that deletes orphan metadata, data and delete files in a table. A file is considered orphan if it is not reachable by any valid snapshot. The set of actual files is built by listing the underlying storage, which makes this operation expensive.
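Conceptually, the orphan set is the difference between the files that storage lists and the files that valid snapshots reference. The following is a minimal sketch of that idea, assuming both sides are already available as sets of path strings. The class and method names are hypothetical, and this is not a description of how the action is implemented internally.

```java
import java.util.Set;
import java.util.TreeSet;

class OrphanFileSketch {
  // Hypothetical helper: returns the listed files that no valid snapshot references.
  static Set<String> findOrphans(Set<String> listedFiles, Set<String> referencedFiles) {
    Set<String> orphans = new TreeSet<>(listedFiles);
    orphans.removeAll(referencedFiles);
    return orphans;
  }

  public static void main(String[] args) {
    Set<String> listed = Set.of("data/a.parquet", "data/b.parquet", "data/c.parquet");
    Set<String> referenced = Set.of("data/a.parquet", "data/c.parquet");
    System.out.println(findOrphans(listed, referenced)); // prints [data/b.parquet]
  }
}
```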
-
-
Nested Class Summary
static class DeleteOrphanFiles.PrefixMismatchMode
Defines the action behavior when location prefixes (scheme/authority) mismatch.
static interface DeleteOrphanFiles.Result
The action result that contains a summary of the execution.
-
Method Summary
DeleteOrphanFiles deleteWith(java.util.function.Consumer<java.lang.String> deleteFunc)
Passes an alternative delete implementation that will be used for orphan files.
default DeleteOrphanFiles equalAuthorities(java.util.Map<java.lang.String,java.lang.String> newEqualAuthorities)
Passes authorities that should be considered equal.
default DeleteOrphanFiles equalSchemes(java.util.Map<java.lang.String,java.lang.String> newEqualSchemes)
Passes schemes that should be considered equal.
DeleteOrphanFiles executeDeleteWith(java.util.concurrent.ExecutorService executorService)
Passes an alternative executor service that will be used for removing orphaned files.
DeleteOrphanFiles location(java.lang.String location)
Passes a location which should be scanned for orphan files.
DeleteOrphanFiles olderThan(long olderThanTimestamp)
Removes orphan files only if they are older than the given timestamp.
default DeleteOrphanFiles prefixMismatchMode(DeleteOrphanFiles.PrefixMismatchMode newPrefixMismatchMode)
Passes a prefix mismatch mode that determines how this action should handle situations when the metadata references files that match listed/provided files except for authority/scheme.
-
-
-
Method Detail
-
location
DeleteOrphanFiles location(java.lang.String location)
Passes a location which should be scanned for orphan files. If not set, the root table location will be scanned, potentially removing both orphan data and metadata files.
- Parameters:
location
- the location where to look for orphan files
- Returns:
- this for method chaining
-
olderThan
DeleteOrphanFiles olderThan(long olderThanTimestamp)
Removes orphan files only if they are older than the given timestamp. This is a safety measure to avoid removing files that are being added to the table. For example, there may be a concurrent operation adding new files while this action searches for orphan files. New files may not be referenced by the metadata yet, but they are not orphan.
If not set, defaults to a timestamp 3 days ago.
- Parameters:
olderThanTimestamp
- a long timestamp, as returned by System.currentTimeMillis()
- Returns:
- this for method chaining
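As an illustration of the timestamp format, the sketch below computes an epoch-millisecond cutoff a given number of days in the past. The class and helper names are hypothetical, and the `action` variable in the comment stands for whatever DeleteOrphanFiles instance the caller already holds.

```java
import java.util.concurrent.TimeUnit;

class OlderThanSketch {
  // Hypothetical helper: epoch-millis cutoff that lies `days` before `nowMillis`.
  static long cutoffMillis(long nowMillis, int days) {
    return nowMillis - TimeUnit.DAYS.toMillis(days);
  }

  public static void main(String[] args) {
    long cutoff = cutoffMillis(System.currentTimeMillis(), 7);
    // With an action instance in hand, the value would be passed as:
    //   action.olderThan(cutoff)
    System.out.println("olderThan cutoff (epoch millis): " + cutoff);
  }
}
```

Choosing a cutoff more recent than the documented 3-day default narrows the safety margin for concurrent writers, so a longer window is the more conservative choice.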
-
deleteWith
DeleteOrphanFiles deleteWith(java.util.function.Consumer<java.lang.String> deleteFunc)
Passes an alternative delete implementation that will be used for orphan files. This method allows users to customize the delete function. For example, one may set a custom delete function and collect all orphan files into a set instead of physically removing them.
If not set, defaults to using the table's io implementation.
- Parameters:
deleteFunc
- a function that will be called to delete files
- Returns:
- this for method chaining
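The "collect into a set" idea mentioned above could look roughly like this. It is a sketch only: the class and helper names are hypothetical, the paths are made up, and the concurrent set is chosen on the assumption that the delete function may be invoked from more than one thread.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

class DryRunDeleteSketch {
  // Hypothetical helper: a "delete" function that only records each path it receives.
  static Consumer<String> collectInto(Set<String> sink) {
    return sink::add;
  }

  public static void main(String[] args) {
    // Concurrent set, assuming deletes may be issued from several threads.
    Set<String> wouldDelete = ConcurrentHashMap.newKeySet();
    Consumer<String> deleteFunc = collectInto(wouldDelete);

    // With an action instance in hand: action.deleteWith(deleteFunc)
    // Here the function is called directly to show its effect.
    deleteFunc.accept("s3://bucket/table/data/orphan-1.parquet");
    deleteFunc.accept("s3://bucket/table/data/orphan-2.parquet");
    System.out.println(wouldDelete.size() + " files would be deleted"); // prints 2 files would be deleted
  }
}
```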
-
executeDeleteWith
DeleteOrphanFiles executeDeleteWith(java.util.concurrent.ExecutorService executorService)
Passes an alternative executor service that will be used for removing orphaned files. This service will only be used if a custom delete function is provided by deleteWith(Consumer) or if the FileIO does not support bulk deletes. Otherwise, parallelism should be controlled by the IO-specific deleteFiles method.
If this method is not called and bulk deletes are not supported, orphaned manifests and data files will still be deleted in the current thread.
- Parameters:
executorService
- the service to use
- Returns:
- this for method chaining
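To make the interplay between a custom delete function and an executor service concrete, here is a standalone sketch that fans a delete function out over a fixed thread pool. It only illustrates the pattern; the class, the `deleteAll` helper and the file names are hypothetical, and the assumption that the caller shuts the pool down afterwards is mine rather than something this page states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

class DeleteExecutorSketch {
  // Hypothetical helper: runs deleteFunc for every path on the pool and waits for all of them.
  static void deleteAll(List<String> paths, Consumer<String> deleteFunc, ExecutorService pool) {
    List<Future<?>> pending = new ArrayList<>();
    for (String path : paths) {
      pending.add(pool.submit(() -> deleteFunc.accept(path)));
    }
    for (Future<?> future : pending) {
      try {
        future.get();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while deleting", e);
      } catch (ExecutionException e) {
        throw new IllegalStateException("Delete failed", e.getCause());
      }
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    AtomicInteger deleted = new AtomicInteger();
    try {
      // With an action instance in hand:
      //   action.deleteWith(deleteFunc).executeDeleteWith(pool)
      deleteAll(List.of("f1", "f2", "f3"), path -> deleted.incrementAndGet(), pool);
    } finally {
      pool.shutdown(); // assumption: the caller owns the pool and shuts it down
    }
    System.out.println(deleted.get()); // prints 3
  }
}
```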
-
prefixMismatchMode
default DeleteOrphanFiles prefixMismatchMode(DeleteOrphanFiles.PrefixMismatchMode newPrefixMismatchMode)
Passes a prefix mismatch mode that determines how this action should handle situations when the metadata references files that match listed/provided files except for authority/scheme.
Possible values are "ERROR", "IGNORE", "DELETE". The default mismatch mode is "ERROR", which means an exception is thrown whenever there is a mismatch in authority/scheme. It's the recommended mismatch mode and should be changed only in some rare circumstances. If there is a mismatch, use equalSchemes(Map) and equalAuthorities(Map) to resolve conflicts by providing equivalent schemes and authorities.
If it is impossible to determine whether the conflicting authorities/schemes are equal, set the prefix mismatch mode to "IGNORE" to skip files with mismatches.
If you have manually inspected all conflicting authorities/schemes, provided equivalent schemes/authorities and are absolutely confident the remaining ones are different, set the prefix mismatch mode to "DELETE" to consider files with mismatches as orphan. It will be impossible to recover files after deletion, so the "DELETE" prefix mismatch mode must be used with extreme caution.
- Parameters:
newPrefixMismatchMode
- mode for handling prefix mismatches
- Returns:
- this for method chaining
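The three documented modes can be summarized as a small decision function. The sketch below uses a local stand-in enum that mirrors the documented values; it is not the library's DeleteOrphanFiles.PrefixMismatchMode, and the helper is a hypothetical illustration of the described behavior rather than the action's actual code.

```java
class PrefixMismatchSketch {
  // Local stand-in mirroring the three documented values; not the library's own enum.
  enum Mode { ERROR, IGNORE, DELETE }

  // Hypothetical helper: decides whether a listed file that matches a referenced file
  // except for scheme/authority is treated as orphan.
  static boolean treatAsOrphan(Mode mode, String listedPath) {
    switch (mode) {
      case ERROR:
        throw new IllegalStateException("Prefix mismatch for " + listedPath);
      case IGNORE:
        return false; // skip the file and leave it in place
      case DELETE:
        return true; // consider it orphan; deletion cannot be undone
      default:
        throw new IllegalArgumentException("Unknown mode: " + mode);
    }
  }

  public static void main(String[] args) {
    String path = "s3a://bucket/table/data/file.parquet";
    System.out.println(treatAsOrphan(Mode.IGNORE, path)); // prints false
    System.out.println(treatAsOrphan(Mode.DELETE, path)); // prints true
  }
}
```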
-
equalSchemes
default DeleteOrphanFiles equalSchemes(java.util.Map<java.lang.String,java.lang.String> newEqualSchemes)
Passes schemes that should be considered equal. The key may include a comma-separated list of schemes. For instance, Map("s3a,s3,s3n", "s3").
- Parameters:
newEqualSchemes
- map of equal schemes
- Returns:
- this for method chaining
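One way to read the comma-separated key is that every scheme in the key maps to the same canonical scheme. The sketch below expands such a map into one entry per scheme. The class and helper are hypothetical and only show that interpretation; they do not claim to reproduce how the action parses the map.

```java
import java.util.HashMap;
import java.util.Map;

class EqualSchemesSketch {
  // Hypothetical helper: expands comma-separated keys so each scheme maps to its canonical form.
  static Map<String, String> flatten(Map<String, String> equalSchemes) {
    Map<String, String> flat = new HashMap<>();
    for (Map.Entry<String, String> entry : equalSchemes.entrySet()) {
      for (String scheme : entry.getKey().split(",")) {
        flat.put(scheme.trim(), entry.getValue().trim());
      }
    }
    return flat;
  }

  public static void main(String[] args) {
    Map<String, String> equalSchemes = Map.of("s3a,s3,s3n", "s3");
    // With an action instance in hand: action.equalSchemes(equalSchemes)
    System.out.println(flatten(equalSchemes).get("s3n")); // prints s3
  }
}
```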
-
equalAuthorities
default DeleteOrphanFiles equalAuthorities(java.util.Map<java.lang.String,java.lang.String> newEqualAuthorities)
Passes authorities that should be considered equal. The key may include a comma-separated list of authorities. For instance, Map("s1name,s2name", "servicename").
- Parameters:
newEqualAuthorities
- map of equal authorities
- Returns:
- this for method chaining
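To see why equivalent schemes and authorities matter, the sketch below rewrites the scheme and authority of two locations to canonical values and then compares them. It assumes the maps have already been expanded to one entry per scheme or authority. The class, the helper and the example locations are hypothetical, and the comparison is only an illustration of the idea, not the action's matching logic.

```java
import java.net.URI;
import java.util.Map;

class PrefixNormalizeSketch {
  // Hypothetical helper: rewrites scheme and authority to canonical forms;
  // values without a mapping pass through unchanged.
  static String normalize(String location, Map<String, String> schemes, Map<String, String> authorities) {
    URI uri = URI.create(location);
    String scheme = schemes.getOrDefault(uri.getScheme(), uri.getScheme());
    String authority = authorities.getOrDefault(uri.getAuthority(), uri.getAuthority());
    return scheme + "://" + authority + uri.getPath();
  }

  public static void main(String[] args) {
    // Maps already expanded to one entry per scheme/authority.
    Map<String, String> schemes = Map.of("s3a", "s3", "s3n", "s3");
    Map<String, String> authorities = Map.of("s1name", "servicename", "s2name", "servicename");

    String listed = normalize("s3a://s1name/table/data/f.parquet", schemes, authorities);
    String referenced = normalize("s3://s2name/table/data/f.parquet", schemes, authorities);
    System.out.println(listed.equals(referenced)); // prints true
  }
}
```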
-
-