Class StreamingDelete
- java.lang.Object
- 
- org.apache.iceberg.StreamingDelete
 
- 
- All Implemented Interfaces:
- DeleteFiles,- PendingUpdate<Snapshot>,- SnapshotUpdate<DeleteFiles>
 
 public class StreamingDelete extends java.lang.Object implements DeleteFiles Deleteimplementation that avoids loading full manifests in memory.This implementation will attempt to commit 5 times before throwing CommitFailedException.
- 
- 
Method SummaryAll Methods Instance Methods Concrete Methods Modifier and Type Method Description protected voidadd(DataFile file)Add a data file to the new snapshot.protected voidadd(DeleteFile file)Add a delete file to the new snapshot.protected voidadd(ManifestFile manifest)Add all files in a manifest to the new snapshot.protected org.apache.iceberg.DeleteFileIndexaddedDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter, PartitionSet partitionSet)Returns matching delete files have been added to the table since a starting snapshot.protected java.util.List<DataFile>addedFiles()Snapshotapply()Apply the pending changes and return the uncommitted changes for validation.java.util.List<ManifestFile>apply(TableMetadata base)Apply the update's changes to the base table metadata and return the new manifest list.ThisTcaseSensitive(boolean isCaseSensitive)protected voidcleanAll()protected voidcleanUncommitted(java.util.Set<ManifestFile> committed)Clean up any uncommitted manifests that were created.voidcommit()Apply the pending changes and commit.protected TableMetadatacurrent()protected PartitionSpecdataSpec()protected voiddelete(java.lang.CharSequence path)Add a specific data path to be deleted in the new snapshot.protected voiddelete(DataFile file)Add a specific data file to be deleted in the new snapshot.protected voiddelete(DeleteFile file)Add a specific delete file to be deleted in the new snapshot.protected voiddeleteByRowFilter(Expression expr)Add a filter to match files to delete.StreamingDeletedeleteFile(java.lang.CharSequence path)Delete a file path from the underlying table.protected voiddeleteFile(java.lang.String path)StreamingDeletedeleteFile(DataFile file)Delete a file tracked by aDataFilefrom the underlying table.StreamingDeletedeleteFromRowFilter(Expression expr)Delete files that match anExpressionon data rows from the table.ThisTdeleteWith(java.util.function.Consumer<java.lang.String> deleteCallback)Set a callback to delete files instead of the table's default.protected voiddropPartition(int specId, StructLike partition)Add a partition tuple to drop from the table during the delete phase.protected voidfailAnyDelete()protected voidfailMissingDeletePaths()protected booleanisCaseSensitive()protected OutputFilemanifestListPath()protected ManifestReader<DeleteFile>newDeleteManifestReader(ManifestFile manifest)protected ManifestWriter<DeleteFile>newDeleteManifestWriter(PartitionSpec spec)protected OutputFilenewManifestOutput()protected ManifestReader<DataFile>newManifestReader(ManifestFile manifest)protected ManifestWriter<DataFile>newManifestWriter(PartitionSpec spec)protected java.lang.Stringoperation()A string that describes the action that produced the new snapshot.protected TableMetadatarefresh()protected ExpressionrowFilter()ThisTscanManifestsWith(java.util.concurrent.ExecutorService executorService)Use a particular executor to scan manifests.protected DeleteFilesself()ThisTset(java.lang.String property, java.lang.String value)Set a summary property in the snapshot produced by this update.protected voidsetNewFilesSequenceNumber(long sequenceNumber)protected longsnapshotId()ThisTstageOnly()Called to stage a snapshot in table metadata, but not update the current snapshot id.protected java.util.Map<java.lang.String,java.lang.String>summary()java.lang.ObjectupdateEvent()Generates update event to notify about metadata changesprotected voidvalidate(TableMetadata currentMetadata)Validate the current metadata.protected voidvalidateAddedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression conflictDetectionFilter)Validates that no files matching a filter have been added to the table since a starting snapshot.protected voidvalidateAddedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet)Validates that no files matching given partitions have been added to the table since a starting snapshot.protected voidvalidateDataFilesExist(TableMetadata base, java.lang.Long startingSnapshotId, CharSequenceSet requiredDataFiles, boolean skipDeletes, Expression conflictDetectionFilter)protected voidvalidateDeletedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter)Validates that no files matching a filter have been deleted from the table since a starting snapshot.protected voidvalidateDeletedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet)Validates that no files matching a filter have been deleted from the table since a starting snapshot.protected voidvalidateNoNewDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter)Validates that no delete files matching a filter have been added to the table since a starting snapshot.protected voidvalidateNoNewDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet)Validates that no delete files matching a partition set have been added to the table since a starting snapshot.protected voidvalidateNoNewDeletesForDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, java.lang.Iterable<DataFile> dataFiles)Validates that no new delete files that must be applied to the given data files have been added to the table since a starting snapshot.protected voidvalidateNoNewDeletesForDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter, java.lang.Iterable<DataFile> dataFiles)Validates that no new delete files that must be applied to the given data files have been added to the table since a starting snapshot.protected java.util.concurrent.ExecutorServiceworkerPool()- 
Methods inherited from class java.lang.Objectclone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 - 
Methods inherited from interface org.apache.iceberg.DeleteFilescaseSensitive
 - 
Methods inherited from interface org.apache.iceberg.PendingUpdateapply, commit, updateEvent
 - 
Methods inherited from interface org.apache.iceberg.SnapshotUpdatedeleteWith, scanManifestsWith, set, stageOnly
 
- 
 
- 
- 
- 
Method Detail- 
selfprotected DeleteFiles self() 
 - 
operationprotected java.lang.String operation() A string that describes the action that produced the new snapshot.- Returns:
- a string operation
 
 - 
deleteFilepublic StreamingDelete deleteFile(java.lang.CharSequence path) Description copied from interface:DeleteFilesDelete a file path from the underlying table.To remove a file from the table, this path must equal a path in the table's metadata. Paths that are different but equivalent will not be removed. For example, file:/path/file.avro is equivalent to file:///path/file.avro, but would not remove the latter path from the table. - Specified by:
- deleteFilein interface- DeleteFiles
- Parameters:
- path- a fully-qualified file path to remove from the table
- Returns:
- this for method chaining
 
 - 
deleteFilepublic StreamingDelete deleteFile(DataFile file) Description copied from interface:DeleteFilesDelete a file tracked by aDataFilefrom the underlying table.- Specified by:
- deleteFilein interface- DeleteFiles
- Parameters:
- file- a DataFile to remove from the table
- Returns:
- this for method chaining
 
 - 
deleteFromRowFilterpublic StreamingDelete deleteFromRowFilter(Expression expr) Description copied from interface:DeleteFilesDelete files that match anExpressionon data rows from the table.A file is selected to be deleted by the expression if it could contain any rows that match the expression (candidate files are selected using an inclusive projection). These candidate files are deleted if all of the rows in the file must match the expression (the partition data matches the expression'sProjections.strict(PartitionSpec)strict projection}). This guarantees that files are deleted if and only if all rows in the file must match the expression.Files that may contain some rows that match the expression and some rows that do not will result in a ValidationException.- Specified by:
- deleteFromRowFilterin interface- DeleteFiles
- Parameters:
- expr- an expression on rows in the table
- Returns:
- this for method chaining
 
 - 
setpublic ThisT set(java.lang.String property, java.lang.String value)Description copied from interface:SnapshotUpdateSet a summary property in the snapshot produced by this update.- Parameters:
- property- a String property name
- value- a String property value
- Returns:
- this for method chaining
 
 - 
caseSensitivepublic ThisT caseSensitive(boolean isCaseSensitive) 
 - 
isCaseSensitiveprotected boolean isCaseSensitive() 
 - 
dataSpecprotected PartitionSpec dataSpec() 
 - 
rowFilterprotected Expression rowFilter() 
 - 
addedFilesprotected java.util.List<DataFile> addedFiles() 
 - 
failAnyDeleteprotected void failAnyDelete() 
 - 
failMissingDeletePathsprotected void failMissingDeletePaths() 
 - 
deleteByRowFilterprotected void deleteByRowFilter(Expression expr) Add a filter to match files to delete. A file will be deleted if all of the rows it contains match this or any other filter passed to this method.- Parameters:
- expr- an expression to match rows.
 
 - 
dropPartitionprotected void dropPartition(int specId, StructLike partition)Add a partition tuple to drop from the table during the delete phase.
 - 
deleteprotected void delete(DataFile file) Add a specific data file to be deleted in the new snapshot.
 - 
deleteprotected void delete(DeleteFile file) Add a specific delete file to be deleted in the new snapshot.
 - 
deleteprotected void delete(java.lang.CharSequence path) Add a specific data path to be deleted in the new snapshot.
 - 
addprotected void add(DataFile file) Add a data file to the new snapshot.
 - 
addprotected void add(DeleteFile file) Add a delete file to the new snapshot.
 - 
addprotected void add(ManifestFile manifest) Add all files in a manifest to the new snapshot.
 - 
validateAddedDataFilesprotected void validateAddedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet) Validates that no files matching given partitions have been added to the table since a starting snapshot.- Parameters:
- base- table metadata to validate
- startingSnapshotId- id of the snapshot current at the start of the operation
- partitionSet- a set of partitions to filter new conflicting data files
 
 - 
validateAddedDataFilesprotected void validateAddedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression conflictDetectionFilter) Validates that no files matching a filter have been added to the table since a starting snapshot.- Parameters:
- base- table metadata to validate
- startingSnapshotId- id of the snapshot current at the start of the operation
- conflictDetectionFilter- an expression used to find new conflicting data files
 
 - 
validateNoNewDeletesForDataFilesprotected void validateNoNewDeletesForDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, java.lang.Iterable<DataFile> dataFiles) Validates that no new delete files that must be applied to the given data files have been added to the table since a starting snapshot.- Parameters:
- base- table metadata to validate
- startingSnapshotId- id of the snapshot current at the start of the operation
- dataFiles- data files to validate have no new row deletes
 
 - 
validateNoNewDeletesForDataFilesprotected void validateNoNewDeletesForDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter, java.lang.Iterable<DataFile> dataFiles) Validates that no new delete files that must be applied to the given data files have been added to the table since a starting snapshot.- Parameters:
- base- table metadata to validate
- startingSnapshotId- id of the snapshot current at the start of the operation
- dataFilter- a data filter
- dataFiles- data files to validate have no new row deletes
 
 - 
validateNoNewDeleteFilesprotected void validateNoNewDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter) Validates that no delete files matching a filter have been added to the table since a starting snapshot.- Parameters:
- base- table metadata to validate
- startingSnapshotId- id of the snapshot current at the start of the operation
- dataFilter- an expression used to find new conflicting delete files
 
 - 
validateNoNewDeleteFilesprotected void validateNoNewDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet) Validates that no delete files matching a partition set have been added to the table since a starting snapshot.- Parameters:
- base- table metadata to validate
- startingSnapshotId- id of the snapshot current at the start of the operation
- partitionSet- a partition set used to find new conflicting delete files
 
 - 
addedDeleteFilesprotected org.apache.iceberg.DeleteFileIndex addedDeleteFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter, PartitionSet partitionSet) Returns matching delete files have been added to the table since a starting snapshot.- Parameters:
- base- table metadata to validate
- startingSnapshotId- id of the snapshot current at the start of the operation
- dataFilter- an expression used to find delete files
- partitionSet- a partition set used to find delete files
 
 - 
validateDeletedDataFilesprotected void validateDeletedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, Expression dataFilter) Validates that no files matching a filter have been deleted from the table since a starting snapshot.- Parameters:
- base- table metadata to validate
- startingSnapshotId- id of the snapshot current at the start of the operation
- dataFilter- an expression used to find deleted data files
 
 - 
validateDeletedDataFilesprotected void validateDeletedDataFiles(TableMetadata base, java.lang.Long startingSnapshotId, PartitionSet partitionSet) Validates that no files matching a filter have been deleted from the table since a starting snapshot.- Parameters:
- base- table metadata to validate
- startingSnapshotId- id of the snapshot current at the start of the operation
- partitionSet- a partition set used to find deleted data files
 
 - 
setNewFilesSequenceNumberprotected void setNewFilesSequenceNumber(long sequenceNumber) 
 - 
validateDataFilesExistprotected void validateDataFilesExist(TableMetadata base, java.lang.Long startingSnapshotId, CharSequenceSet requiredDataFiles, boolean skipDeletes, Expression conflictDetectionFilter) 
 - 
summaryprotected java.util.Map<java.lang.String,java.lang.String> summary() 
 - 
applypublic java.util.List<ManifestFile> apply(TableMetadata base) Apply the update's changes to the base table metadata and return the new manifest list.- Parameters:
- base- the base table metadata to apply changes to
- Returns:
- a manifest list for the new snapshot.
 
 - 
updateEventpublic java.lang.Object updateEvent() Description copied from interface:PendingUpdateGenerates update event to notify about metadata changes- Returns:
- the generated event
 
 - 
cleanUncommittedprotected void cleanUncommitted(java.util.Set<ManifestFile> committed) Clean up any uncommitted manifests that were created.Manifests may not be committed if apply is called more because a commit conflict has occurred. Implementations may keep around manifests because the same changes will be made by both apply calls. This method instructs the implementation to clean up those manifests and passes the paths of the manifests that were actually committed. - Parameters:
- committed- a set of manifest paths that were actually committed
 
 - 
stageOnlypublic ThisT stageOnly() Description copied from interface:SnapshotUpdateCalled to stage a snapshot in table metadata, but not update the current snapshot id.- Specified by:
- stageOnlyin interface- SnapshotUpdate<ThisT>
- Returns:
- this for method chaining
 
 - 
scanManifestsWithpublic ThisT scanManifestsWith(java.util.concurrent.ExecutorService executorService) Description copied from interface:SnapshotUpdateUse a particular executor to scan manifests. The default worker pool will be used by default.- Specified by:
- scanManifestsWithin interface- SnapshotUpdate<ThisT>
- Parameters:
- executorService- the provided executor
- Returns:
- this for method chaining
 
 - 
workerPoolprotected java.util.concurrent.ExecutorService workerPool() 
 - 
deleteWithpublic ThisT deleteWith(java.util.function.Consumer<java.lang.String> deleteCallback) Description copied from interface:SnapshotUpdateSet a callback to delete files instead of the table's default.- Specified by:
- deleteWithin interface- SnapshotUpdate<ThisT>
- Parameters:
- deleteCallback- a String consumer used to delete locations.
- Returns:
- this for method chaining
 
 - 
validateprotected void validate(TableMetadata currentMetadata) Validate the current metadata.Child operations can override this to add custom validation. - Parameters:
- currentMetadata- current table metadata to validate
 
 - 
applypublic Snapshot apply() Description copied from interface:PendingUpdateApply the pending changes and return the uncommitted changes for validation.This does not result in a permanent update. - Specified by:
- applyin interface- PendingUpdate<ThisT>
- Returns:
- the uncommitted changes that would be committed by calling PendingUpdate.commit()
 
 - 
currentprotected TableMetadata current() 
 - 
refreshprotected TableMetadata refresh() 
 - 
commitpublic void commit() Description copied from interface:PendingUpdateApply the pending changes and commit.Changes are committed by calling the underlying table's commit method. Once the commit is successful, the updated table will be refreshed. - Specified by:
- commitin interface- PendingUpdate<ThisT>
 
 - 
cleanAllprotected void cleanAll() 
 - 
deleteFileprotected void deleteFile(java.lang.String path) 
 - 
manifestListPathprotected OutputFile manifestListPath() 
 - 
newManifestOutputprotected OutputFile newManifestOutput() 
 - 
newManifestWriterprotected ManifestWriter<DataFile> newManifestWriter(PartitionSpec spec) 
 - 
newDeleteManifestWriterprotected ManifestWriter<DeleteFile> newDeleteManifestWriter(PartitionSpec spec) 
 - 
newManifestReaderprotected ManifestReader<DataFile> newManifestReader(ManifestFile manifest) 
 - 
newDeleteManifestReaderprotected ManifestReader<DeleteFile> newDeleteManifestReader(ManifestFile manifest) 
 - 
snapshotIdprotected long snapshotId() 
 
- 
 
-