Package org.apache.iceberg.actions
Interface RewritePositionDeleteFiles
- All Superinterfaces:
Action<RewritePositionDeleteFiles,RewritePositionDeleteFiles.Result>, SnapshotUpdate<RewritePositionDeleteFiles,RewritePositionDeleteFiles.Result>
- All Known Implementing Classes:
RewritePositionDeleteFilesSparkAction
public interface RewritePositionDeleteFiles
extends SnapshotUpdate<RewritePositionDeleteFiles,RewritePositionDeleteFiles.Result>
An action for rewriting position delete files.
Generally used for optimizing the size and layout of position delete files within a table.
-
Nested Class Summary
Modifier and Type | Interface | Description
static interface | RewritePositionDeleteFiles.FileGroupInfo | A description of a position delete file group, when it was processed, and within which partition.
static interface | RewritePositionDeleteFiles.FileGroupRewriteResult | For a particular position delete file group, the number of position delete files which are newly created and the number of files which were formerly part of the table but have been rewritten.
static interface | RewritePositionDeleteFiles.Result | The action result that contains a summary of the execution.
-
Field Summary
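Modifier and Type | Field | Description
static final String | MAX_CONCURRENT_FILE_GROUP_REWRITES | The max number of file groups to be simultaneously rewritten by the rewrite strategy.
static final int | MAX_CONCURRENT_FILE_GROUP_REWRITES_DEFAULT |
static final String | PARTIAL_PROGRESS_ENABLED | Enable committing groups of files (see max-file-group-size-bytes) prior to the entire rewrite completing.
static final boolean | PARTIAL_PROGRESS_ENABLED_DEFAULT |
static final String | PARTIAL_PROGRESS_MAX_COMMITS | The maximum number of Iceberg commits that this rewrite is allowed to produce if partial progress is enabled.
static final int | PARTIAL_PROGRESS_MAX_COMMITS_DEFAULT |
static final String | REWRITE_JOB_ORDER | Forces the rewrite job order based on the value.
static final String | REWRITE_JOB_ORDER_DEFAULT |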
-
Method Summary
Modifier and Type | Method | Description
RewritePositionDeleteFiles | filter(Expression expression) | A filter for finding deletes to rewrite.
Methods inherited from interface org.apache.iceberg.actions.SnapshotUpdate
snapshotProperty
-
Field Details
-
PARTIAL_PROGRESS_ENABLED
Enable committing groups of files (see max-file-group-size-bytes) before the entire rewrite completes. This will produce additional commits but allow for progress even if some groups fail to commit. This setting will not change the correctness of the rewrite operation, as file groups can be compacted independently. The default is false, which produces a single commit when the entire job has completed.
-
PARTIAL_PROGRESS_ENABLED_DEFAULT
static final boolean PARTIAL_PROGRESS_ENABLED_DEFAULT
-
PARTIAL_PROGRESS_MAX_COMMITS
The maximum number of Iceberg commits that this rewrite is allowed to produce if partial progress is enabled. This setting has no effect if partial progress is disabled.
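To make the interaction between the two partial-progress settings concrete, here is a minimal sketch of how a list of rewritten file groups could be divided into a bounded number of commits. The class `PartialProgressSketch`, the method `planCommits`, and the ceiling-division batching rule are assumptions made for illustration; this page does not specify how an implementation distributes groups across commits.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not taken from the Iceberg code base.
final class PartialProgressSketch {

  // Returns one inner list per commit. With partial progress disabled,
  // every group lands in a single commit and maxCommits is ignored.
  static List<List<String>> planCommits(
      List<String> groups, boolean partialProgressEnabled, int maxCommits) {
    List<List<String>> commits = new ArrayList<>();
    if (groups.isEmpty()) {
      return commits;
    }
    if (!partialProgressEnabled) {
      commits.add(new ArrayList<>(groups));
      return commits;
    }
    if (maxCommits < 1) {
      throw new IllegalArgumentException("maxCommits must be at least 1");
    }
    // Assumed rule: round up so that at most maxCommits batches are produced.
    int groupsPerCommit = (groups.size() + maxCommits - 1) / maxCommits;
    for (int start = 0; start < groups.size(); start += groupsPerCommit) {
      int end = Math.min(start + groupsPerCommit, groups.size());
      commits.add(new ArrayList<>(groups.subList(start, end)));
    }
    return commits;
  }

  public static void main(String[] args) {
    List<String> groups = List.of("g1", "g2", "g3", "g4", "g5", "g6", "g7");
    System.out.println(planCommits(groups, true, 3));
    System.out.println(planCommits(groups, false, 3));
  }
}
```

In this sketch a failure in one batch would leave earlier batches committed, which is the trade-off partial progress accepts in exchange for extra commits.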
-
PARTIAL_PROGRESS_MAX_COMMITS_DEFAULT
static final int PARTIAL_PROGRESS_MAX_COMMITS_DEFAULT
-
MAX_CONCURRENT_FILE_GROUP_REWRITES
The maximum number of file groups to be rewritten simultaneously by the rewrite strategy. The structure and contents of each group are determined by the rewrite strategy. Each file group will be rewritten independently and asynchronously.
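The idea of a bounded number of independent, asynchronous group rewrites can be sketched with a fixed-size thread pool. The class `ConcurrentRewriteSketch` and the `rewriteGroup` function are hypothetical stand-ins, and a real implementation may schedule its work quite differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Illustrative sketch only; not taken from the Iceberg code base.
final class ConcurrentRewriteSketch {

  // Submits one task per group; at most maxConcurrent tasks run at a time.
  // Results are returned in the order the groups were supplied.
  static <G, R> List<R> rewriteAll(
      List<G> groups, int maxConcurrent, Function<G, R> rewriteGroup) {
    ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
    try {
      List<Future<R>> futures = new ArrayList<>();
      for (G group : groups) {
        Callable<R> task = () -> rewriteGroup.apply(group);
        futures.add(pool.submit(task));
      }
      List<R> results = new ArrayList<>();
      for (Future<R> future : futures) {
        results.add(future.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while rewriting", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("A file group failed to rewrite", e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> groups = List.of("group-0", "group-1", "group-2", "group-3");
    System.out.println(rewriteAll(groups, 2, g -> g + ":rewritten"));
  }
}
```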
-
MAX_CONCURRENT_FILE_GROUP_REWRITES_DEFAULT
static final int MAX_CONCURRENT_FILE_GROUP_REWRITES_DEFAULT
-
REWRITE_JOB_ORDER
Forces the rewrite job order based on the value.
- If rewrite-job-order=bytes-asc, then rewrite the smallest job groups first.
- If rewrite-job-order=bytes-desc, then rewrite the largest job groups first.
- If rewrite-job-order=files-asc, then rewrite the job groups with the fewest files first.
- If rewrite-job-order=files-desc, then rewrite the job groups with the most files first.
- If rewrite-job-order=none, then rewrite job groups in the order they were planned (no specific ordering).
Defaults to none.
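The five ordering values listed above can be illustrated with plain comparators. In this sketch the `Group` class, with its `bytes` and `files` fields, is a hypothetical stand-in for a planned file group; only the option values themselves come from this page.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; not taken from the Iceberg code base.
final class JobOrderSketch {

  // Hypothetical stand-in for a planned file group.
  static final class Group {
    final String name;
    final long bytes;
    final int files;

    Group(String name, long bytes, int files) {
      this.name = name;
      this.bytes = bytes;
      this.files = files;
    }
  }

  // Returns a new list ordered according to the rewrite-job-order value.
  static List<Group> order(List<Group> planned, String jobOrder) {
    List<Group> sorted = new ArrayList<>(planned);
    switch (jobOrder) {
      case "bytes-asc":
        sorted.sort(Comparator.comparingLong((Group g) -> g.bytes));
        break;
      case "bytes-desc":
        sorted.sort(Comparator.comparingLong((Group g) -> g.bytes).reversed());
        break;
      case "files-asc":
        sorted.sort(Comparator.comparingInt((Group g) -> g.files));
        break;
      case "files-desc":
        sorted.sort(Comparator.comparingInt((Group g) -> g.files).reversed());
        break;
      case "none":
        break; // keep the planned order
      default:
        throw new IllegalArgumentException("Unknown rewrite-job-order: " + jobOrder);
    }
    return sorted;
  }

  // Convenience helper that extracts group names for display.
  static List<String> names(List<Group> groups) {
    List<String> result = new ArrayList<>();
    for (Group g : groups) {
      result.add(g.name);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Group> planned = List.of(
        new Group("a", 300L, 1), new Group("b", 100L, 5), new Group("c", 200L, 3));
    for (String o : List.of("bytes-asc", "bytes-desc", "files-asc", "files-desc", "none")) {
      System.out.println(o + " -> " + names(order(planned, o)));
    }
  }
}
```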
-
REWRITE_JOB_ORDER_DEFAULT
-
-
Method Details
-
filter
A filter for finding deletes to rewrite. The filter will be converted to a partition filter with an inclusive projection, so any file that may contain rows matching this filter will be used by the action. The matching delete files will be rewritten.
- Parameters:
expression - An Iceberg expression used to find deletes.
- Returns:
this for method chaining
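An inclusive projection turns a row-level predicate into a partition-level predicate that never excludes a partition which might hold a matching row. The following sketch shows the idea for an assumed row filter `eventDate >= cutoff` on a table assumed to be partitioned by month; the column, the partitioning scheme, and the class `InclusiveProjectionSketch` are all hypothetical and do not use the Iceberg expression API.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not taken from the Iceberg code base.
final class InclusiveProjectionSketch {

  // Row filter: eventDate >= cutoff.
  // Inclusive projection onto month partitions: month >= month(cutoff).
  // The cutoff month is kept even though some of its rows fall before the cutoff.
  static boolean partitionMayMatch(YearMonth partition, LocalDate cutoff) {
    return !partition.isBefore(YearMonth.from(cutoff));
  }

  // Selects every partition that may contain rows matching the row filter.
  static List<YearMonth> selectPartitions(List<YearMonth> partitions, LocalDate cutoff) {
    List<YearMonth> selected = new ArrayList<>();
    for (YearMonth partition : partitions) {
      if (partitionMayMatch(partition, cutoff)) {
        selected.add(partition);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<YearMonth> partitions =
        List.of(YearMonth.of(2024, 2), YearMonth.of(2024, 3), YearMonth.of(2024, 4));
    System.out.println(selectPartitions(partitions, LocalDate.of(2024, 3, 15)));
  }
}
```

Because the projection is inclusive, delete files in the cutoff month are selected in full, which is why the action may rewrite more deletes than the row filter strictly matches.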
-