Package org.apache.iceberg.actions
Interface RewritePositionDeleteFiles
- All Superinterfaces:
Action<RewritePositionDeleteFiles,RewritePositionDeleteFiles.Result>, SnapshotUpdate<RewritePositionDeleteFiles,RewritePositionDeleteFiles.Result>
- All Known Implementing Classes:
RewritePositionDeleteFilesSparkAction
public interface RewritePositionDeleteFiles
extends SnapshotUpdate<RewritePositionDeleteFiles,RewritePositionDeleteFiles.Result>
An action for rewriting position delete files.
Generally used for optimizing the size and layout of position delete files within a table.
-
Nested Class Summary
Nested Classes
- static interface RewritePositionDeleteFiles.FileGroupInfo: A description of a position delete file group, when it was processed, and within which partition.
- static interface RewritePositionDeleteFiles.FileGroupRewriteResult: For a particular position delete file group, the number of position delete files which are newly created and the number of files which were formerly part of the table but have been rewritten.
- static interface RewritePositionDeleteFiles.Result: The action result that contains a summary of the execution.
-
Field Summary
Fields
- static final String MAX_CONCURRENT_FILE_GROUP_REWRITES: The max number of file groups to be simultaneously rewritten by the rewrite strategy.
- static final int MAX_CONCURRENT_FILE_GROUP_REWRITES_DEFAULT
- static final String PARTIAL_PROGRESS_ENABLED: Enable committing groups of files (see max-file-group-size-bytes) prior to the entire rewrite completing.
- static final boolean PARTIAL_PROGRESS_ENABLED_DEFAULT
- static final String PARTIAL_PROGRESS_MAX_COMMITS: The maximum number of Iceberg commits that this rewrite is allowed to produce if partial progress is enabled.
- static final int PARTIAL_PROGRESS_MAX_COMMITS_DEFAULT
- static final String REWRITE_JOB_ORDER: Forces the rewrite job order based on the value.
- static final String REWRITE_JOB_ORDER_DEFAULT
-
Method Summary
Methods
- RewritePositionDeleteFiles filter(Expression expression): A filter for finding deletes to rewrite.
Methods inherited from interface org.apache.iceberg.actions.SnapshotUpdate
snapshotProperty
-
Field Details
-
PARTIAL_PROGRESS_ENABLED
Enable committing groups of files (see max-file-group-size-bytes) prior to the entire rewrite completing. This will produce additional commits but allow for progress even if some groups fail to commit. This setting will not change the correctness of the rewrite operation, as file groups can be compacted independently.
The default is false, which produces a single commit when the entire job has completed.
-
PARTIAL_PROGRESS_ENABLED_DEFAULT
static final boolean PARTIAL_PROGRESS_ENABLED_DEFAULT
-
PARTIAL_PROGRESS_MAX_COMMITS
The maximum number of Iceberg commits that this rewrite is allowed to produce if partial progress is enabled. This setting has no effect if partial progress is disabled.
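The interaction between the two partial-progress settings can be sketched as follows. This is a minimal illustration, not the action's actual implementation: the class `PartialProgressSketch`, the method `commitSizes`, and the ceiling-division batching rule are all assumptions made for the example, and it ignores groups that fail to commit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: how many file groups each commit could carry.
class PartialProgressSketch {

  /**
   * Returns the number of file groups in each commit.
   * Assumption: groups are spread evenly so the commit count never exceeds maxCommits.
   */
  static List<Integer> commitSizes(int totalGroups, boolean partialProgressEnabled, int maxCommits) {
    List<Integer> sizes = new ArrayList<>();
    if (totalGroups <= 0) {
      return sizes; // nothing to commit
    }
    if (!partialProgressEnabled) {
      sizes.add(totalGroups); // single commit once the whole job has completed
      return sizes;
    }
    if (maxCommits < 1) {
      throw new IllegalArgumentException("maxCommits must be at least 1");
    }
    // Ceiling division keeps the number of batches at or below maxCommits.
    int perCommit = (totalGroups + maxCommits - 1) / maxCommits;
    for (int remaining = totalGroups; remaining > 0; remaining -= perCommit) {
      sizes.add(Math.min(perCommit, remaining));
    }
    return sizes;
  }

  public static void main(String[] args) {
    System.out.println(commitSizes(25, true, 10));
    System.out.println(commitSizes(25, false, 10));
  }
}
```

With partial progress disabled, the max-commits argument is ignored, which mirrors the statement above that the setting has no effect in that case.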
-
PARTIAL_PROGRESS_MAX_COMMITS_DEFAULT
static final int PARTIAL_PROGRESS_MAX_COMMITS_DEFAULT
-
MAX_CONCURRENT_FILE_GROUP_REWRITES
The max number of file groups to be simultaneously rewritten by the rewrite strategy. The structure and contents of each group are determined by the rewrite strategy. Each file group will be rewritten independently and asynchronously.
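One common way to bound concurrent work like this is a fixed-size thread pool. The sketch below is illustrative only and does not claim to reflect how the action schedules rewrites: `ConcurrentRewriteSketch` and `rewriteAll` are hypothetical names, and a short sleep stands in for rewriting a file group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: rewrite groups independently, at most maxConcurrent at a time.
class ConcurrentRewriteSketch {

  /** Runs groupCount simulated rewrites and returns the peak number in flight. */
  static int rewriteAll(int groupCount, int maxConcurrent) {
    ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (int i = 0; i < groupCount; i++) {
        futures.add(pool.submit(() -> {
          int now = inFlight.incrementAndGet();
          peak.accumulateAndGet(now, Math::max); // record the highest concurrency seen
          try {
            Thread.sleep(10); // stand-in for rewriting one file group
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
          inFlight.decrementAndGet();
        }));
      }
      for (Future<?> future : futures) {
        future.get(); // wait for every group to finish
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }

  public static void main(String[] args) {
    System.out.println("peak concurrency: " + rewriteAll(12, 3));
  }
}
```

Because the pool has only `maxConcurrent` worker threads, the peak can never exceed that limit, regardless of how many groups are submitted.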
-
MAX_CONCURRENT_FILE_GROUP_REWRITES_DEFAULT
static final int MAX_CONCURRENT_FILE_GROUP_REWRITES_DEFAULT
-
REWRITE_JOB_ORDER
Forces the rewrite job order based on the value.
- If rewrite-job-order=bytes-asc, then rewrite the smallest job groups first.
- If rewrite-job-order=bytes-desc, then rewrite the largest job groups first.
- If rewrite-job-order=files-asc, then rewrite the job groups with the fewest files first.
- If rewrite-job-order=files-desc, then rewrite the job groups with the most files first.
- If rewrite-job-order=none, then rewrite job groups in the order they were planned (no specific ordering).
Defaults to none.
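The five ordering values listed above can be illustrated with plain comparators. This is a sketch under stated assumptions rather than the library's code: `RewriteJobOrderSketch`, `Group`, and `order` are hypothetical names, and each group is reduced to a total byte size and a file count.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: order planned file groups by a rewrite-job-order value.
class RewriteJobOrderSketch {

  /** Stand-in for a planned file group: a label, its total bytes, and its file count. */
  static final class Group {
    final String name;
    final long bytes;
    final int files;

    Group(String name, long bytes, int files) {
      this.name = name;
      this.bytes = bytes;
      this.files = files;
    }
  }

  /** Returns group names in the order implied by the given rewrite-job-order value. */
  static List<String> order(List<Group> planned, String jobOrder) {
    List<Group> sorted = new ArrayList<>(planned); // copy so the planned order is untouched
    switch (jobOrder) {
      case "bytes-asc":
        sorted.sort(Comparator.comparingLong((Group g) -> g.bytes));
        break;
      case "bytes-desc":
        sorted.sort(Comparator.comparingLong((Group g) -> g.bytes).reversed());
        break;
      case "files-asc":
        sorted.sort(Comparator.comparingInt((Group g) -> g.files));
        break;
      case "files-desc":
        sorted.sort(Comparator.comparingInt((Group g) -> g.files).reversed());
        break;
      case "none":
        break; // keep the order in which groups were planned
      default:
        throw new IllegalArgumentException("Unknown rewrite-job-order: " + jobOrder);
    }
    List<String> names = new ArrayList<>();
    for (Group g : sorted) {
      names.add(g.name);
    }
    return names;
  }

  public static void main(String[] args) {
    List<Group> planned = new ArrayList<>();
    planned.add(new Group("a", 300L, 2));
    planned.add(new Group("b", 100L, 9));
    planned.add(new Group("c", 200L, 5));
    System.out.println(order(planned, "bytes-asc"));  // [b, c, a]
    System.out.println(order(planned, "files-desc")); // [b, c, a]
    System.out.println(order(planned, "none"));       // [a, b, c]
  }
}
```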
-
REWRITE_JOB_ORDER_DEFAULT
-
-
Method Details
-
filter
A filter for finding deletes to rewrite.
The filter will be converted to a partition filter with an inclusive projection. Any file that may contain rows matching this filter will be used by the action. The matching delete files will be rewritten.
- Parameters:
expression - An Iceberg expression used to find deletes.
- Returns:
this for method chaining
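To make "inclusive projection" concrete, the sketch below assumes a hypothetical table partitioned by the day of a timestamp column and a row filter of the form `ts >= bound`. The names `InclusiveProjectionSketch`, `day`, and `mayContainMatches` are invented for this illustration and are not part of the Iceberg API. The projected partition filter keeps every partition that could hold a matching row, even if some rows in it do not match.

```java
// Hypothetical sketch: inclusive projection of "ts >= bound" onto a day(ts) partition.
class InclusiveProjectionSketch {

  static final long MILLIS_PER_DAY = 86_400_000L;

  /** Maps a timestamp in epoch milliseconds to its day ordinal (days since the epoch). */
  static long day(long tsMillis) {
    return Math.floorDiv(tsMillis, MILLIS_PER_DAY);
  }

  /**
   * Partition-level test derived from the row filter "ts >= lowerBoundMillis".
   * Inclusive: the partition holding the bound is kept, although only part of it matches.
   */
  static boolean mayContainMatches(long partitionDay, long lowerBoundMillis) {
    return partitionDay >= day(lowerBoundMillis);
  }

  public static void main(String[] args) {
    long noonOnDay10 = 10 * MILLIS_PER_DAY + MILLIS_PER_DAY / 2;
    System.out.println(mayContainMatches(9, noonOnDay10));  // false: every row is before the bound
    System.out.println(mayContainMatches(10, noonOnDay10)); // true: rows after noon match
    System.out.println(mayContainMatches(11, noonOnDay10)); // true: every row matches
  }
}
```

The day-10 partition is selected even though its morning rows fall outside the filter, which is why the action may rewrite delete files that cover some non-matching rows.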
-