public class BaseReplacePartitions extends java.lang.Object implements ReplacePartitions
| Modifier and Type | Method and Description | 
|---|---|
| protected void | add(DataFile file)Add a data file to the new snapshot. | 
| protected void | add(DeleteFile file)Add a delete file to the new snapshot. | 
| protected void | add(ManifestFile manifest)Add all files in a manifest to the new snapshot. | 
| protected org.apache.iceberg.DeleteFileIndex | addedDeleteFiles(TableMetadata base,
                java.lang.Long startingSnapshotId,
                Expression dataFilter,
                PartitionSet partitionSet)Deprecated. 
 will be removed in 1.3.0; use  MergingSnapshotProducer.addedDeleteFiles(TableMetadata, Long, Expression, PartitionSet,
     Snapshot)instead | 
| protected org.apache.iceberg.DeleteFileIndex | addedDeleteFiles(TableMetadata base,
                java.lang.Long startingSnapshotId,
                Expression dataFilter,
                PartitionSet partitionSet,
                Snapshot parent)Returns matching delete files have been added to the table since a starting snapshot. | 
| protected java.util.List<DataFile> | addedFiles() | 
| ReplacePartitions | addFile(DataFile file)Add a  DataFileto the table. | 
| Snapshot | apply()Apply the pending changes and return the uncommitted changes for validation. | 
| java.util.List<ManifestFile> | apply(TableMetadata base,
     Snapshot snapshot)Apply the update's changes to the given metadata and snapshot. | 
| ThisT | caseSensitive(boolean isCaseSensitive) | 
| protected void | cleanAll() | 
| protected void | cleanUncommitted(java.util.Set<ManifestFile> committed)Clean up any uncommitted manifests that were created. | 
| void | commit()Apply the pending changes and commit. | 
| protected CommitMetrics | commitMetrics() | 
| protected TableMetadata | current() | 
| protected PartitionSpec | dataSpec() | 
| protected void | delete(java.lang.CharSequence path)Add a specific data path to be deleted in the new snapshot. | 
| protected void | delete(DataFile file)Add a specific data file to be deleted in the new snapshot. | 
| protected void | delete(DeleteFile file)Add a specific delete file to be deleted in the new snapshot. | 
| protected void | deleteByRowFilter(Expression expr)Add a filter to match files to delete. | 
| protected void | deleteFile(java.lang.String path) | 
| ThisT | deleteWith(java.util.function.Consumer<java.lang.String> deleteCallback)Set a callback to delete files instead of the table's default. | 
| protected void | dropPartition(int specId,
             StructLike partition)Add a partition tuple to drop from the table during the delete phase. | 
| protected void | failAnyDelete() | 
| protected void | failMissingDeletePaths() | 
| protected boolean | isCaseSensitive() | 
| protected OutputFile | manifestListPath() | 
| protected ManifestReader<DeleteFile> | newDeleteManifestReader(ManifestFile manifest) | 
| protected ManifestWriter<DeleteFile> | newDeleteManifestWriter(PartitionSpec spec) | 
| protected OutputFile | newManifestOutput() | 
| protected ManifestReader<DataFile> | newManifestReader(ManifestFile manifest) | 
| protected ManifestWriter<DataFile> | newManifestWriter(PartitionSpec spec) | 
| protected java.lang.String | operation()A string that describes the action that produced the new snapshot. | 
| protected TableMetadata | refresh() | 
| protected ThisT | reportWith(MetricsReporter newReporter) | 
| protected Expression | rowFilter() | 
| ThisT | scanManifestsWith(java.util.concurrent.ExecutorService executorService)Use a particular executor to scan manifests. | 
| protected ReplacePartitions | self() | 
| ThisT | set(java.lang.String property,
   java.lang.String value)Set a summary property in the snapshot produced by this update. | 
| protected void | setNewFilesSequenceNumber(long sequenceNumber) | 
| protected long | snapshotId() | 
| ThisT | stageOnly()Called to stage a snapshot in table metadata, but not update the current snapshot id. | 
| protected java.util.Map<java.lang.String,java.lang.String> | summary() | 
| protected java.lang.String | targetBranch() | 
| protected void | targetBranch(java.lang.String branch)* A setter for the target branch on which snapshot producer operation should be performed | 
| BaseReplacePartitions | toBranch(java.lang.String branch)Perform operations on a particular branch | 
| java.lang.Object | updateEvent()Generates update event to notify about metadata changes | 
| void | validate(TableMetadata currentMetadata,
        Snapshot parent)Validate the current metadata. | 
| protected void | validateAddedDataFiles(TableMetadata base,
                      java.lang.Long startingSnapshotId,
                      Expression conflictDetectionFilter)Deprecated. 
 will be removed in 1.3.0; use  MergingSnapshotProducer.validateAddedDataFiles(TableMetadata, Long, Expression, Snapshot)instead | 
| protected void | validateAddedDataFiles(TableMetadata base,
                      java.lang.Long startingSnapshotId,
                      Expression conflictDetectionFilter,
                      Snapshot parent)Validates that no files matching a filter have been added to the table since a starting
 snapshot. | 
| protected void | validateAddedDataFiles(TableMetadata base,
                      java.lang.Long startingSnapshotId,
                      PartitionSet partitionSet)Deprecated. 
 will be removed in 1.3.0; use  MergingSnapshotProducer.validateAddedDataFiles(TableMetadata, Long, PartitionSet,
     Snapshot)instead | 
| protected void | validateAddedDataFiles(TableMetadata base,
                      java.lang.Long startingSnapshotId,
                      PartitionSet partitionSet,
                      Snapshot parent)Validates that no files matching given partitions have been added to the table since a starting
 snapshot. | 
| ReplacePartitions | validateAppendOnly()Validate that no partitions will be replaced and the operation is append-only. | 
| protected void | validateDataFilesExist(TableMetadata base,
                      java.lang.Long startingSnapshotId,
                      CharSequenceSet requiredDataFiles,
                      boolean skipDeletes,
                      Expression conflictDetectionFilter)Deprecated.  | 
| protected void | validateDataFilesExist(TableMetadata base,
                      java.lang.Long startingSnapshotId,
                      CharSequenceSet requiredDataFiles,
                      boolean skipDeletes,
                      Expression conflictDetectionFilter,
                      Snapshot parent) | 
| protected void | validateDeletedDataFiles(TableMetadata base,
                        java.lang.Long startingSnapshotId,
                        Expression dataFilter)Deprecated. 
 will be removed in 1.3.0; use  MergingSnapshotProducer.validateDeletedDataFiles(TableMetadata, Long, Expression,
     Snapshot)instead | 
| protected void | validateDeletedDataFiles(TableMetadata base,
                        java.lang.Long startingSnapshotId,
                        Expression dataFilter,
                        Snapshot parent)Validates that no files matching a filter have been deleted from the table since a starting
 snapshot. | 
| protected void | validateDeletedDataFiles(TableMetadata base,
                        java.lang.Long startingSnapshotId,
                        PartitionSet partitionSet)Deprecated. 
 will be removed in 1.3.0; use  MergingSnapshotProducer.validateNoNewDeleteFiles(TableMetadata, Long, PartitionSet,
     Snapshot)instead | 
| protected void | validateDeletedDataFiles(TableMetadata base,
                        java.lang.Long startingSnapshotId,
                        PartitionSet partitionSet,
                        Snapshot parent)Validates that no files matching a filter have been deleted from the table since a starting
 snapshot. | 
| ReplacePartitions | validateFromSnapshot(long newStartingSnapshotId)Set the snapshot ID used in validations for this operation. | 
| ReplacePartitions | validateNoConflictingData()Enables validation that data added concurrently does not conflict with this commit's operation. | 
| ReplacePartitions | validateNoConflictingDeletes()Enables validation that deletes that happened concurrently do not conflict with this commit's
 operation. | 
| protected void | validateNoNewDeleteFiles(TableMetadata base,
                        java.lang.Long startingSnapshotId,
                        Expression dataFilter)Deprecated. 
 will be removed in 1.3.0; use  MergingSnapshotProducer.validateNoNewDeleteFiles(org.apache.iceberg.TableMetadata, java.lang.Long, org.apache.iceberg.expressions.Expression)instead | 
| protected void | validateNoNewDeleteFiles(TableMetadata base,
                        java.lang.Long startingSnapshotId,
                        Expression dataFilter,
                        Snapshot parent)Validates that no delete files matching a filter have been added to the table since a starting
 snapshot. | 
| protected void | validateNoNewDeleteFiles(TableMetadata base,
                        java.lang.Long startingSnapshotId,
                        PartitionSet partitionSet)Deprecated. 
 will be removed in 1.3.0; use  MergingSnapshotProducer.validateNoNewDeleteFiles(TableMetadata, Long, PartitionSet,
     Snapshot)instead | 
| protected void | validateNoNewDeleteFiles(TableMetadata base,
                        java.lang.Long startingSnapshotId,
                        PartitionSet partitionSet,
                        Snapshot parent)Validates that no delete files matching a partition set have been added to the table since a
 starting snapshot. | 
| protected void | validateNoNewDeletesForDataFiles(TableMetadata base,
                                java.lang.Long startingSnapshotId,
                                Expression dataFilter,
                                java.lang.Iterable<DataFile> dataFiles)Deprecated. 
 will be removed in 1.3.0; use  MergingSnapshotProducer.validateNoNewDeletesForDataFiles(org.apache.iceberg.TableMetadata, java.lang.Long, java.lang.Iterable<org.apache.iceberg.DataFile>, org.apache.iceberg.Snapshot)instead | 
| protected void | validateNoNewDeletesForDataFiles(TableMetadata base,
                                java.lang.Long startingSnapshotId,
                                Expression dataFilter,
                                java.lang.Iterable<DataFile> dataFiles,
                                Snapshot parent)Validates that no new delete files that must be applied to the given data files have been added
 to the table since a starting snapshot. | 
| protected void | validateNoNewDeletesForDataFiles(TableMetadata base,
                                java.lang.Long startingSnapshotId,
                                java.lang.Iterable<DataFile> dataFiles)Deprecated. 
 will be removed in 1.3.0; use  MergingSnapshotProducer.validateNoNewDeletesForDataFiles(org.apache.iceberg.TableMetadata, java.lang.Long, java.lang.Iterable<org.apache.iceberg.DataFile>, org.apache.iceberg.Snapshot)instead | 
| protected void | validateNoNewDeletesForDataFiles(TableMetadata base,
                                java.lang.Long startingSnapshotId,
                                java.lang.Iterable<DataFile> dataFiles,
                                Snapshot parent)Validates that no new delete files that must be applied to the given data files have been added
 to the table since a starting snapshot. | 
| protected java.util.concurrent.ExecutorService | workerPool() | 
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitdeleteWith, scanManifestsWith, set, stageOnlyapply, commit, updateEventprotected ReplacePartitions self()
protected java.lang.String operation()
public ReplacePartitions addFile(DataFile file)
ReplacePartitionsDataFile to the table.addFile in interface ReplacePartitionsfile - a data filepublic ReplacePartitions validateAppendOnly()
ReplacePartitionsvalidateAppendOnly in interface ReplacePartitionspublic ReplacePartitions validateFromSnapshot(long newStartingSnapshotId)
ReplacePartitionsAll validations will check changes after this snapshot ID. If this is not called, validation will occur from the beginning of the table's history.
This method should be called before this operation is committed. If a concurrent operation committed a data or delta file or removed a data file after the given snapshot ID that might contain rows matching a partition marked for deletion, validation will detect this and fail.
validateFromSnapshot in interface ReplacePartitionsnewStartingSnapshotId - a snapshot ID, it should be set to when this operation started to read the
     table.public ReplacePartitions validateNoConflictingDeletes()
ReplacePartitionsValidating concurrent deletes is required during non-idempotent replace partition operations. This will check if a concurrent operation deletes data in any of the partitions being overwritten, as the replace partition must be aborted to avoid undeleting rows that were removed concurrently.
validateNoConflictingDeletes in interface ReplacePartitionspublic ReplacePartitions validateNoConflictingData()
ReplacePartitionsValidating concurrent data files is required during non-idempotent replace partition operations. This will check if a concurrent operation inserts data in any of the partitions being overwritten, as the replace partition must be aborted to avoid removing rows added concurrently.
validateNoConflictingData in interface ReplacePartitionspublic BaseReplacePartitions toBranch(java.lang.String branch)
SnapshotUpdatetoBranch in interface SnapshotUpdate<ReplacePartitions>branch - which is name of SnapshotRef of type branch.public void validate(TableMetadata currentMetadata, Snapshot parent)
Child operations can override this to add custom validation.
currentMetadata - current table metadata to validateparent - ending snapshot on the lineage which is being validatedpublic java.util.List<ManifestFile> apply(TableMetadata base, Snapshot snapshot)
base - the base table metadata to apply changes tosnapshot - snapshot to apply the changes topublic ThisT set(java.lang.String property,
                 java.lang.String value)
SnapshotUpdateproperty - a String property namevalue - a String property valuepublic ThisT caseSensitive(boolean isCaseSensitive)
protected boolean isCaseSensitive()
protected PartitionSpec dataSpec()
protected Expression rowFilter()
protected java.util.List<DataFile> addedFiles()
protected void failAnyDelete()
protected void failMissingDeletePaths()
protected void deleteByRowFilter(Expression expr)
expr - an expression to match rows.protected void dropPartition(int specId,
                             StructLike partition)
protected void delete(DataFile file)
protected void delete(DeleteFile file)
protected void delete(java.lang.CharSequence path)
protected void add(DataFile file)
protected void add(DeleteFile file)
protected void add(ManifestFile manifest)
@Deprecated protected void validateAddedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet)
MergingSnapshotProducer.validateAddedDataFiles(TableMetadata, Long, PartitionSet,
     Snapshot) insteadbase - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationpartitionSet - a set of partitions to filter new conflicting data filesprotected void validateAddedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet, Snapshot parent)
base - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationpartitionSet - a set of partitions to filter new conflicting data filesparent - ending snapshot on the lineage being validated@Deprecated protected void validateAddedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression conflictDetectionFilter)
MergingSnapshotProducer.validateAddedDataFiles(TableMetadata, Long, Expression, Snapshot)
     insteadbase - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationconflictDetectionFilter - an expression used to find new conflicting data filesprotected void validateAddedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression conflictDetectionFilter, Snapshot parent)
base - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationconflictDetectionFilter - an expression used to find new conflicting data filesprotected void validateNoNewDeletesForDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, java.lang.Iterable<DataFile> dataFiles, Snapshot parent)
base - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationdataFiles - data files to validate have no new row deletesparent - ending snapshot on the branch being validated@Deprecated protected void validateNoNewDeletesForDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, java.lang.Iterable<DataFile> dataFiles)
MergingSnapshotProducer.validateNoNewDeletesForDataFiles(org.apache.iceberg.TableMetadata, java.lang.Long, java.lang.Iterable<org.apache.iceberg.DataFile>, org.apache.iceberg.Snapshot) insteadbase - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationdataFiles - data files to validate have no new row deletes@Deprecated protected void validateNoNewDeletesForDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter, java.lang.Iterable<DataFile> dataFiles)
MergingSnapshotProducer.validateNoNewDeletesForDataFiles(org.apache.iceberg.TableMetadata, java.lang.Long, java.lang.Iterable<org.apache.iceberg.DataFile>, org.apache.iceberg.Snapshot) insteadbase - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationdataFilter - a data filterdataFiles - data files to validate have no new row deletesprotected void validateNoNewDeletesForDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter, java.lang.Iterable<DataFile> dataFiles, Snapshot parent)
base - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationdataFilter - a data filterdataFiles - data files to validate have no new row deletesparent - ending snapshot on the branch being validated@Deprecated protected void validateNoNewDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter)
MergingSnapshotProducer.validateNoNewDeleteFiles(org.apache.iceberg.TableMetadata, java.lang.Long, org.apache.iceberg.expressions.Expression) insteadbase - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationdataFilter - an expression used to find new conflicting delete filesprotected void validateNoNewDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter, Snapshot parent)
base - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationdataFilter - an expression used to find new conflicting delete filesparent - ending snapshot on the branch being validated@Deprecated protected void validateNoNewDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet)
MergingSnapshotProducer.validateNoNewDeleteFiles(TableMetadata, Long, PartitionSet,
     Snapshot) insteadbase - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationpartitionSet - a partition set used to find new conflicting delete filesprotected void validateNoNewDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet, Snapshot parent)
base - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationpartitionSet - a partition set used to find new conflicting delete filesparent - ending snapshot on the branch being validated@Deprecated protected org.apache.iceberg.DeleteFileIndex addedDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter, PartitionSet partitionSet)
MergingSnapshotProducer.addedDeleteFiles(TableMetadata, Long, Expression, PartitionSet,
     Snapshot) insteadbase - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationdataFilter - an expression used to find delete filespartitionSet - a partition set used to find delete filesprotected org.apache.iceberg.DeleteFileIndex addedDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter, PartitionSet partitionSet, Snapshot parent)
base - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationdataFilter - an expression used to find delete filespartitionSet - a partition set used to find delete filesparent - parent snapshot of the branch@Deprecated protected void validateDeletedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter)
MergingSnapshotProducer.validateDeletedDataFiles(TableMetadata, Long, Expression,
     Snapshot) insteadbase - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationdataFilter - an expression used to find deleted data filesprotected void validateDeletedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter, Snapshot parent)
base - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationdataFilter - an expression used to find deleted data filesparent - ending snapshot on the branch being validated@Deprecated protected void validateDeletedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet)
MergingSnapshotProducer.validateNoNewDeleteFiles(TableMetadata, Long, PartitionSet,
     Snapshot) insteadbase - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationpartitionSet - a partition set used to find deleted data filesprotected void validateDeletedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet, Snapshot parent)
base - table metadata to validatestartingSnapshotId - id of the snapshot current at the start of the operationpartitionSet - a partition set used to find deleted data filesparent - ending snapshot on the branch being validatedprotected void setNewFilesSequenceNumber(long sequenceNumber)
@Deprecated protected void validateDataFilesExist(TableMetadata base, java.lang.Long startingSnapshotId, CharSequenceSet requiredDataFiles, boolean skipDeletes, Expression conflictDetectionFilter)
protected void validateDataFilesExist(TableMetadata base, java.lang.Long startingSnapshotId, CharSequenceSet requiredDataFiles, boolean skipDeletes, Expression conflictDetectionFilter, Snapshot parent)
protected java.util.Map<java.lang.String,java.lang.String> summary()
public java.lang.Object updateEvent()
PendingUpdateprotected void cleanUncommitted(java.util.Set<ManifestFile> committed)
Manifests may not be committed if apply is called more because a commit conflict has occurred. Implementations may keep around manifests because the same changes will be made by both apply calls. This method instructs the implementation to clean up those manifests and passes the paths of the manifests that were actually committed.
committed - a set of manifest paths that were actually committedpublic ThisT stageOnly()
SnapshotUpdatestageOnly in interface SnapshotUpdate<ThisT>public ThisT scanManifestsWith(java.util.concurrent.ExecutorService executorService)
SnapshotUpdatescanManifestsWith in interface SnapshotUpdate<ThisT>executorService - the provided executorprotected CommitMetrics commitMetrics()
protected ThisT reportWith(MetricsReporter newReporter)
protected void targetBranch(java.lang.String branch)
branch - to set as target branchprotected java.lang.String targetBranch()
protected java.util.concurrent.ExecutorService workerPool()
public ThisT deleteWith(java.util.function.Consumer<java.lang.String> deleteCallback)
SnapshotUpdatedeleteWith in interface SnapshotUpdate<ThisT>deleteCallback - a String consumer used to delete locations.public Snapshot apply()
PendingUpdateThis does not result in a permanent update.
apply in interface PendingUpdate<Snapshot>PendingUpdate.commit()protected TableMetadata current()
protected TableMetadata refresh()
public void commit()
PendingUpdateChanges are committed by calling the underlying table's commit method.
Once the commit is successful, the updated table will be refreshed.
commit in interface PendingUpdate<Snapshot>protected void cleanAll()
protected void deleteFile(java.lang.String path)
protected OutputFile manifestListPath()
protected OutputFile newManifestOutput()
protected ManifestWriter<DataFile> newManifestWriter(PartitionSpec spec)
protected ManifestWriter<DeleteFile> newDeleteManifestWriter(PartitionSpec spec)
protected ManifestReader<DataFile> newManifestReader(ManifestFile manifest)
protected ManifestReader<DeleteFile> newDeleteManifestReader(ManifestFile manifest)
protected long snapshotId()