Class BaseRewriteManifests
- java.lang.Object
-
- org.apache.iceberg.BaseRewriteManifests
-
- All Implemented Interfaces:
PendingUpdate<Snapshot>
,RewriteManifests
,SnapshotUpdate<RewriteManifests>
public class BaseRewriteManifests extends java.lang.Object implements RewriteManifests
-
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description RewriteManifests
addManifest(ManifestFile manifest)
Adds amanifest file
to the table.Snapshot
apply()
Apply the pending changes and return the uncommitted changes for validation.java.util.List<ManifestFile>
apply(TableMetadata base, Snapshot snapshot)
Apply the update's changes to the given metadata and snapshot.protected void
cleanAll()
protected void
cleanUncommitted(java.util.Set<ManifestFile> committed)
Clean up any uncommitted manifests that were created.RewriteManifests
clusterBy(java.util.function.Function<DataFile,java.lang.Object> func)
Groups an existingDataFile
by a cluster key produced by a function.void
commit()
Apply the pending changes and commit.protected CommitMetrics
commitMetrics()
protected TableMetadata
current()
protected void
deleteFile(java.lang.String path)
RewriteManifests
deleteManifest(ManifestFile manifest)
Deletes amanifest file
from the table.ThisT
deleteWith(java.util.function.Consumer<java.lang.String> deleteCallback)
Set a callback to delete files instead of the table's default.protected OutputFile
manifestListPath()
protected ManifestReader<DeleteFile>
newDeleteManifestReader(ManifestFile manifest)
protected ManifestWriter<DeleteFile>
newDeleteManifestWriter(PartitionSpec spec)
protected OutputFile
newManifestOutput()
protected ManifestReader<DataFile>
newManifestReader(ManifestFile manifest)
protected ManifestWriter<DataFile>
newManifestWriter(PartitionSpec spec)
protected RollingManifestWriter<DeleteFile>
newRollingDeleteManifestWriter(PartitionSpec spec)
protected RollingManifestWriter<DataFile>
newRollingManifestWriter(PartitionSpec spec)
protected java.lang.String
operation()
A string that describes the action that produced the new snapshot.protected TableMetadata
refresh()
protected ThisT
reportWith(MetricsReporter newReporter)
RewriteManifests
rewriteIf(java.util.function.Predicate<ManifestFile> pred)
Determines which existingManifestFile
for the table should be rewritten.ThisT
scanManifestsWith(java.util.concurrent.ExecutorService executorService)
Use a particular executor to scan manifests.protected RewriteManifests
self()
RewriteManifests
set(java.lang.String property, java.lang.String value)
Set a summary property in the snapshot produced by this update.protected long
snapshotId()
ThisT
stageOnly()
Called to stage a snapshot in table metadata, but not update the current snapshot id.protected java.util.Map<java.lang.String,java.lang.String>
summary()
protected java.lang.String
targetBranch()
protected void
targetBranch(java.lang.String branch)
A setter for the target branch on which snapshot producer operation should be performedprotected void
validate(TableMetadata currentMetadata, Snapshot snapshot)
Validate the current metadata.protected java.util.concurrent.ExecutorService
workerPool()
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.iceberg.PendingUpdate
apply, commit, updateEvent
-
Methods inherited from interface org.apache.iceberg.SnapshotUpdate
deleteWith, scanManifestsWith, stageOnly, toBranch
-
-
-
-
Method Detail
-
self
protected RewriteManifests self()
-
operation
protected java.lang.String operation()
A string that describes the action that produced the new snapshot.- Returns:
- a string operation
-
set
public RewriteManifests set(java.lang.String property, java.lang.String value)
Description copied from interface:SnapshotUpdate
Set a summary property in the snapshot produced by this update.- Specified by:
set
in interfaceSnapshotUpdate<RewriteManifests>
- Parameters:
property
- a String property namevalue
- a String property value- Returns:
- this for method chaining
-
summary
protected java.util.Map<java.lang.String,java.lang.String> summary()
-
clusterBy
public RewriteManifests clusterBy(java.util.function.Function<DataFile,java.lang.Object> func)
Description copied from interface:RewriteManifests
Groups an existingDataFile
by a cluster key produced by a function. The cluster key will determine which data file will be associated with a particular manifest. All data files with the same cluster key will be written to the same manifest (unless the file is large and split into multiple files). Manifests deleted viaRewriteManifests.deleteManifest(ManifestFile)
or added viaRewriteManifests.addManifest(ManifestFile)
are ignored during the rewrite process.- Specified by:
clusterBy
in interfaceRewriteManifests
- Parameters:
func
- Function used to cluster data files to manifests.- Returns:
- this for method chaining
-
rewriteIf
public RewriteManifests rewriteIf(java.util.function.Predicate<ManifestFile> pred)
Description copied from interface:RewriteManifests
Determines which existingManifestFile
for the table should be rewritten. Manifests that do not match the predicate are kept as-is. If this is not called and no predicate is set, then all manifests will be rewritten.- Specified by:
rewriteIf
in interfaceRewriteManifests
- Parameters:
pred
- Predicate used to determine which manifests to rewrite. If true then the manifest file will be included for rewrite. If false then then manifest is kept as-is.- Returns:
- this for method chaining
-
deleteManifest
public RewriteManifests deleteManifest(ManifestFile manifest)
Description copied from interface:RewriteManifests
Deletes amanifest file
from the table.- Specified by:
deleteManifest
in interfaceRewriteManifests
- Parameters:
manifest
- a manifest to delete- Returns:
- this for method chaining
-
addManifest
public RewriteManifests addManifest(ManifestFile manifest)
Description copied from interface:RewriteManifests
Adds amanifest file
to the table. The added manifest cannot contain new or deleted files.By default, the manifest will be rewritten to ensure all entries have explicit snapshot IDs. In that case, it is always the responsibility of the caller to manage the lifecycle of the original manifest.
If manifest entries are allowed to inherit the snapshot ID assigned on commit, the manifest should never be deleted manually if the commit succeeds as it will become part of the table metadata and will be cleaned up on expiry. If the manifest gets merged with others while preparing a new snapshot, it will be deleted automatically if this operation is successful. If the commit fails, the manifest will never be deleted and it is up to the caller whether to delete or reuse it.
- Specified by:
addManifest
in interfaceRewriteManifests
- Parameters:
manifest
- a manifest to add- Returns:
- this for method chaining
-
apply
public java.util.List<ManifestFile> apply(TableMetadata base, Snapshot snapshot)
Apply the update's changes to the given metadata and snapshot. Return the new manifest list.- Parameters:
base
- the base table metadata to apply changes tosnapshot
- snapshot to apply the changes to- Returns:
- a manifest list for the new snapshot.
-
cleanUncommitted
protected void cleanUncommitted(java.util.Set<ManifestFile> committed)
Clean up any uncommitted manifests that were created.Manifests may not be committed if apply is called more because a commit conflict has occurred. Implementations may keep around manifests because the same changes will be made by both apply calls. This method instructs the implementation to clean up those manifests and passes the paths of the manifests that were actually committed.
- Parameters:
committed
- a set of manifest paths that were actually committed
-
stageOnly
public ThisT stageOnly()
Description copied from interface:SnapshotUpdate
Called to stage a snapshot in table metadata, but not update the current snapshot id.- Specified by:
stageOnly
in interfaceSnapshotUpdate<ThisT>
- Returns:
- this for method chaining
-
scanManifestsWith
public ThisT scanManifestsWith(java.util.concurrent.ExecutorService executorService)
Description copied from interface:SnapshotUpdate
Use a particular executor to scan manifests. The default worker pool will be used by default.- Specified by:
scanManifestsWith
in interfaceSnapshotUpdate<ThisT>
- Parameters:
executorService
- the provided executor- Returns:
- this for method chaining
-
commitMetrics
protected CommitMetrics commitMetrics()
-
reportWith
protected ThisT reportWith(MetricsReporter newReporter)
-
targetBranch
protected void targetBranch(java.lang.String branch)
A setter for the target branch on which snapshot producer operation should be performed- Parameters:
branch
- to set as target branch
-
targetBranch
protected java.lang.String targetBranch()
-
workerPool
protected java.util.concurrent.ExecutorService workerPool()
-
deleteWith
public ThisT deleteWith(java.util.function.Consumer<java.lang.String> deleteCallback)
Description copied from interface:SnapshotUpdate
Set a callback to delete files instead of the table's default.- Specified by:
deleteWith
in interfaceSnapshotUpdate<ThisT>
- Parameters:
deleteCallback
- a String consumer used to delete locations.- Returns:
- this for method chaining
-
validate
protected void validate(TableMetadata currentMetadata, Snapshot snapshot)
Validate the current metadata.Child operations can override this to add custom validation.
- Parameters:
currentMetadata
- current table metadata to validatesnapshot
- ending snapshot on the lineage which is being validated
-
apply
public Snapshot apply()
Description copied from interface:PendingUpdate
Apply the pending changes and return the uncommitted changes for validation.This does not result in a permanent update.
- Specified by:
apply
in interfacePendingUpdate<ThisT>
- Returns:
- the uncommitted changes that would be committed by calling
PendingUpdate.commit()
-
current
protected TableMetadata current()
-
refresh
protected TableMetadata refresh()
-
commit
public void commit()
Description copied from interface:PendingUpdate
Apply the pending changes and commit.Changes are committed by calling the underlying table's commit method.
Once the commit is successful, the updated table will be refreshed.
- Specified by:
commit
in interfacePendingUpdate<ThisT>
-
cleanAll
protected void cleanAll()
-
deleteFile
protected void deleteFile(java.lang.String path)
-
manifestListPath
protected OutputFile manifestListPath()
-
newManifestOutput
protected OutputFile newManifestOutput()
-
newManifestWriter
protected ManifestWriter<DataFile> newManifestWriter(PartitionSpec spec)
-
newDeleteManifestWriter
protected ManifestWriter<DeleteFile> newDeleteManifestWriter(PartitionSpec spec)
-
newRollingManifestWriter
protected RollingManifestWriter<DataFile> newRollingManifestWriter(PartitionSpec spec)
-
newRollingDeleteManifestWriter
protected RollingManifestWriter<DeleteFile> newRollingDeleteManifestWriter(PartitionSpec spec)
-
newManifestReader
protected ManifestReader<DataFile> newManifestReader(ManifestFile manifest)
-
newDeleteManifestReader
protected ManifestReader<DeleteFile> newDeleteManifestReader(ManifestFile manifest)
-
snapshotId
protected long snapshotId()
-
-