Interface RewriteManifests
- All Superinterfaces:
PendingUpdate<Snapshot>, SnapshotUpdate<RewriteManifests>
- All Known Implementing Classes:
BaseRewriteManifests
This API accumulates manifest files, produces a new Snapshot of the table described only by the manifest files that were added, and commits that snapshot as the current snapshot.
This API can be used to rewrite matching manifests according to a clustering function as well as to replace specific manifests. Manifests that are deleted or added directly are ignored during the rewrite process. The set of active files in replaced manifests must be the same as in new manifests.
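The requirement that replaced and new manifests cover the same active files can be pictured as a set comparison. The sketch below is illustrative only: it models a manifest as a plain list of data file paths, which is a hypothetical stand-in rather than Iceberg's ManifestFile, and the class and method names are invented for this example. How the library itself enforces the requirement is not described on this page.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: a manifest is the list of paths of its active data files.
final class ActiveFilesCheckSketch {

  // Collects every active file path across a group of manifests.
  static Set<String> activeFiles(List<List<String>> manifests) {
    Set<String> files = new HashSet<>();
    for (List<String> manifest : manifests) {
      files.addAll(manifest);
    }
    return files;
  }

  // True when the replaced and the new manifests track exactly the same files.
  static boolean sameActiveFiles(List<List<String>> replaced, List<List<String>> added) {
    return activeFiles(replaced).equals(activeFiles(added));
  }

  public static void main(String[] args) {
    List<List<String>> replaced = Arrays.asList(
        Arrays.asList("a.parquet", "b.parquet"),
        Arrays.asList("c.parquet"));
    // Same three files, regrouped into different manifests.
    List<List<String>> regrouped = Arrays.asList(
        Arrays.asList("a.parquet"),
        Arrays.asList("b.parquet", "c.parquet"));
    // One file is missing, so this replacement would lose track of c.parquet.
    List<List<String>> lossy = Arrays.asList(
        Arrays.asList("a.parquet", "b.parquet"));

    System.out.println(sameActiveFiles(replaced, regrouped));
    System.out.println(sameActiveFiles(replaced, lossy));
  }
}
```

Regrouping files across manifests keeps the two sets equal, while dropping or adding a file does not.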
When committing, these changes will be applied to the latest table snapshot. Commit conflicts will be resolved by applying the changes to the new latest snapshot and reattempting the commit.
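The conflict handling described above follows the general optimistic-retry pattern: read the latest state, re-apply the pending change on top of it, and try to swap it in. The following is a minimal sketch of that pattern only, using an AtomicReference as a hypothetical stand-in for the table's current snapshot pointer. The class name, the maxAttempts parameter, and the string snapshots are assumptions made for illustration; the library's actual retry policy is not modelled.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

final class CommitRetrySketch {

  // Applies `change` to the latest value and retries whenever another writer
  // committed in between. Returns the number of attempts that were needed.
  static <T> int commitWithRetry(
      AtomicReference<T> current, UnaryOperator<T> change, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      T base = current.get();          // read the latest snapshot
      T updated = change.apply(base);  // re-apply the pending changes on top of it
      if (current.compareAndSet(base, updated)) {
        return attempt;                // no concurrent commit happened
      }
    }
    throw new IllegalStateException("commit failed after " + maxAttempts + " attempts");
  }

  public static void main(String[] args) {
    AtomicReference<String> table = new AtomicReference<>("snap-1");
    boolean[] interfered = {false};
    int attempts = commitWithRetry(table, base -> {
      if (!interfered[0]) {
        interfered[0] = true;
        table.set("snap-2");           // simulate a concurrent writer, once
      }
      return base + "+rewrite";
    }, 3);
    System.out.println(attempts + " attempts, current = " + table.get());
  }
}
```

In the usage shown in main, the first attempt is built on a stale snapshot and is rejected; the second attempt re-applies the change to the newer snapshot and succeeds.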
-
Method Summary
Modifier and Type | Method | Description
RewriteManifests | addManifest(ManifestFile manifest) | Adds a manifest file to the table.
RewriteManifests | clusterBy(Function<DataFile, Object> func) | Groups an existing DataFile by a cluster key produced by a function.
RewriteManifests | deleteManifest(ManifestFile manifest) | Deletes a manifest file from the table.
RewriteManifests | rewriteIf(Predicate<ManifestFile> predicate) | Determines which existing ManifestFile for the table should be rewritten.
Methods inherited from interface org.apache.iceberg.PendingUpdate
apply, commit, updateEvent
Methods inherited from interface org.apache.iceberg.SnapshotUpdate
deleteWith, scanManifestsWith, set, stageOnly, toBranch
-
Method Details
-
clusterBy
Groups an existing DataFile by a cluster key produced by a function. The cluster key determines which manifest a data file is associated with. All data files with the same cluster key are written to the same manifest (unless that manifest is large and is split into multiple files). Manifests deleted via deleteManifest(ManifestFile) or added via addManifest(ManifestFile) are ignored during the rewrite process.
- Parameters:
func - Function used to cluster data files to manifests.
- Returns:
this for method chaining
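The grouping rule can be illustrated without the library. The sketch below is a simplified model, not Iceberg's implementation: FileStub is a hypothetical stand-in for DataFile with only a path and a partition value, and the splitting of large manifests is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class ClusterBySketch {

  // Hypothetical stand-in for a data file: a path and a partition value.
  static final class FileStub {
    final String path;
    final String partition;

    FileStub(String path, String partition) {
      this.path = path;
      this.partition = partition;
    }
  }

  // Files that produce the same cluster key end up in the same output manifest.
  // Each map entry models one manifest as the list of file paths it would hold.
  static Map<Object, List<String>> cluster(
      List<FileStub> files, Function<FileStub, Object> func) {
    Map<Object, List<String>> manifests = new LinkedHashMap<>();
    for (FileStub file : files) {
      manifests.computeIfAbsent(func.apply(file), key -> new ArrayList<>()).add(file.path);
    }
    return manifests;
  }

  public static void main(String[] args) {
    List<FileStub> files = Arrays.asList(
        new FileStub("a.parquet", "2024-01"),
        new FileStub("b.parquet", "2024-02"),
        new FileStub("c.parquet", "2024-01"));
    // Cluster by partition value: a.parquet and c.parquet share a manifest.
    System.out.println(cluster(files, file -> file.partition));
  }
}
```

Clustering by a partition value, as in main, is one plausible choice of key; any function of the data file can serve as the key in this model.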
-
rewriteIf
Determines which existing ManifestFile for the table should be rewritten. Manifests that do not match the predicate are kept as-is. If this method is never called, so that no predicate is set, all manifests will be rewritten.
- Parameters:
predicate - Predicate used to determine which manifests to rewrite. If it returns true, the manifest file is included in the rewrite. If it returns false, the manifest is kept as-is.
- Returns:
this for method chaining
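The selection rule, including the default when no predicate is set, can be sketched as follows. This is an assumption-laden model rather than library code: ManifestStub is a hypothetical stand-in for ManifestFile carrying only a path and a length in bytes, and a null predicate is used here to represent "rewriteIf was never called".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class RewriteIfSketch {

  // Hypothetical stand-in for a manifest: a path and its length in bytes.
  static final class ManifestStub {
    final String path;
    final long length;

    ManifestStub(String path, long length) {
      this.path = path;
      this.length = length;
    }
  }

  // Returns the paths of manifests selected for rewrite. A null predicate
  // models the case where no predicate was set: every manifest is selected.
  static List<String> selectForRewrite(
      List<ManifestStub> manifests, Predicate<ManifestStub> predicate) {
    List<String> selected = new ArrayList<>();
    for (ManifestStub manifest : manifests) {
      if (predicate == null || predicate.test(manifest)) {
        selected.add(manifest.path);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<ManifestStub> manifests = Arrays.asList(
        new ManifestStub("m1.avro", 4_000L),
        new ManifestStub("m2.avro", 9_000_000L),
        new ManifestStub("m3.avro", 2_500L));
    // Rewrite only small manifests; the 1 MB threshold is arbitrary.
    System.out.println(selectForRewrite(manifests, m -> m.length < 1_000_000L));
    // No predicate: everything is rewritten.
    System.out.println(selectForRewrite(manifests, null));
  }
}
```

Manifests left out of the returned list correspond to those kept as-is.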
-
deleteManifest
Deletes a manifest file from the table.
- Parameters:
manifest - a manifest to delete
- Returns:
this for method chaining
-
addManifest
Adds a manifest file to the table. The added manifest cannot contain new or deleted files.
By default, the manifest will be rewritten to ensure all entries have explicit snapshot IDs. In that case, it is always the responsibility of the caller to manage the lifecycle of the original manifest.
If manifest entries are allowed to inherit the snapshot ID assigned on commit, the manifest should never be deleted manually if the commit succeeds, as it will become part of the table metadata and will be cleaned up on expiry. If the manifest gets merged with others while preparing a new snapshot, it will be deleted automatically if this operation is successful. If the commit fails, the manifest will never be deleted, and it is up to the caller whether to delete or reuse it.
- Parameters:
manifest - a manifest to add
- Returns:
this for method chaining
-