Class BaseOverwriteFiles
- java.lang.Object
-
- org.apache.iceberg.BaseOverwriteFiles
-
- All Implemented Interfaces:
OverwriteFiles
,PendingUpdate<Snapshot>
,SnapshotUpdate<OverwriteFiles>
public class BaseOverwriteFiles extends java.lang.Object implements OverwriteFiles
-
-
Constructor Summary
Constructors Modifier Constructor Description protected
BaseOverwriteFiles(java.lang.String tableName, TableOperations ops)
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description protected void
add(DataFile file)
Add a data file to the new snapshot.protected void
add(DeleteFile file)
Add a delete file to the new snapshot.protected void
add(ManifestFile manifest)
Add all files in a manifest to the new snapshot.protected java.util.List<DataFile>
addedFiles()
OverwriteFiles
addFile(DataFile file)
Add aDataFile
to the table.Snapshot
apply()
Apply the pending changes and return the uncommitted changes for validation.java.util.List<ManifestFile>
apply(TableMetadata base)
Apply the update's changes to the base table metadata and return the new manifest list.protected void
cleanAll()
protected void
cleanUncommitted(java.util.Set<ManifestFile> committed)
Clean up any uncommitted manifests that were created.void
commit()
Apply the pending changes and commit.protected TableMetadata
current()
protected void
delete(java.lang.CharSequence path)
Add a specific data path to be deleted in the new snapshot.protected void
delete(DataFile file)
Add a specific data file to be deleted in the new snapshot.protected void
delete(DeleteFile file)
Add a specific delete file to be deleted in the new snapshot.protected void
deleteByRowFilter(Expression expr)
Add a filter to match files to delete.protected void
deleteFile(java.lang.String path)
OverwriteFiles
deleteFile(DataFile file)
Delete aDataFile
from the table.ThisT
deleteWith(java.util.function.Consumer<java.lang.String> deleteCallback)
Set a callback to delete files instead of the table's default.protected void
dropPartition(StructLike partition)
Add a partition tuple to drop from the table during the delete phase.protected void
failAnyDelete()
protected void
failMissingDeletePaths()
protected OutputFile
manifestListPath()
protected ManifestReader<DeleteFile>
newDeleteManifestReader(ManifestFile manifest)
protected ManifestWriter<DeleteFile>
newDeleteManifestWriter(PartitionSpec spec)
protected OutputFile
newManifestOutput()
protected ManifestReader<DataFile>
newManifestReader(ManifestFile manifest)
protected ManifestWriter<DataFile>
newManifestWriter(PartitionSpec spec)
protected java.lang.String
operation()
A string that describes the action that produced the new snapshot.OverwriteFiles
overwriteByRowFilter(Expression expr)
Delete files that match anExpression
on data rows from the table.protected TableMetadata
refresh()
protected Expression
rowFilter()
protected OverwriteFiles
self()
ThisT
set(java.lang.String property, java.lang.String value)
Set a summary property in the snapshot produced by this update.protected long
snapshotId()
ThisT
stageOnly()
Called to stage a snapshot in table metadata, but not update the current snapshot id.protected java.util.Map<java.lang.String,java.lang.String>
summary()
java.lang.Object
updateEvent()
Generates update event to notify about metadata changesOverwriteFiles
validateAddedFilesMatchOverwriteFilter()
Signal that each file added to the table must match the overwrite expression.OverwriteFiles
validateNoConflictingAppends(java.lang.Long newReadSnapshotId, Expression newConflictDetectionFilter)
Enables validation that files added concurrently do not conflict with this commit's operation.protected PartitionSpec
writeSpec()
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.iceberg.PendingUpdate
apply, commit, updateEvent
-
Methods inherited from interface org.apache.iceberg.SnapshotUpdate
deleteWith, set, stageOnly
-
-
-
-
Constructor Detail
-
BaseOverwriteFiles
protected BaseOverwriteFiles(java.lang.String tableName, TableOperations ops)
-
-
Method Detail
-
self
protected OverwriteFiles self()
-
operation
protected java.lang.String operation()
A string that describes the action that produced the new snapshot.- Returns:
- a string operation
-
overwriteByRowFilter
public OverwriteFiles overwriteByRowFilter(Expression expr)
Description copied from interface:OverwriteFiles
Delete files that match anExpression
on data rows from the table.A file is selected to be deleted by the expression if it could contain any rows that match the expression (candidate files are selected using an
inclusive projection
). These candidate files are deleted if all of the rows in the file must match the expression (the partition data matches the expression'sProjections.strict(PartitionSpec)
strict projection}). This guarantees that files are deleted if and only if all rows in the file must match the expression.Files that may contain some rows that match the expression and some rows that do not will result in a
ValidationException
.- Specified by:
overwriteByRowFilter
in interfaceOverwriteFiles
- Parameters:
expr
- an expression on rows in the table- Returns:
- this for method chaining
-
addFile
public OverwriteFiles addFile(DataFile file)
Description copied from interface:OverwriteFiles
Add aDataFile
to the table.- Specified by:
addFile
in interfaceOverwriteFiles
- Parameters:
file
- a data file- Returns:
- this for method chaining
-
deleteFile
public OverwriteFiles deleteFile(DataFile file)
Description copied from interface:OverwriteFiles
Delete aDataFile
from the table.- Specified by:
deleteFile
in interfaceOverwriteFiles
- Parameters:
file
- a data file- Returns:
- this for method chaining
-
validateAddedFilesMatchOverwriteFilter
public OverwriteFiles validateAddedFilesMatchOverwriteFilter()
Description copied from interface:OverwriteFiles
Signal that each file added to the table must match the overwrite expression.If this method is called, each added file is validated on commit to ensure that it matches the overwrite row filter. This is used to ensure that writes are idempotent: that files cannot be added during a commit that would not be removed if the operation were run a second time.
- Specified by:
validateAddedFilesMatchOverwriteFilter
in interfaceOverwriteFiles
- Returns:
- this for method chaining
-
validateNoConflictingAppends
public OverwriteFiles validateNoConflictingAppends(java.lang.Long newReadSnapshotId, Expression newConflictDetectionFilter)
Description copied from interface:OverwriteFiles
Enables validation that files added concurrently do not conflict with this commit's operation.This method should be called when the table is queried to determine which files to delete/append. If a concurrent operation commits a new file after the data was read and that file might contain rows matching the specified conflict detection filter, the overwrite operation will detect this during retries and fail.
Calling this method with a correct conflict detection filter is required to maintain serializable isolation for eager update/delete operations. Otherwise, the isolation level will be snapshot isolation.
- Specified by:
validateNoConflictingAppends
in interfaceOverwriteFiles
- Parameters:
newReadSnapshotId
- the snapshot id that was used to read the data or null if the table was emptynewConflictDetectionFilter
- an expression on rows in the table- Returns:
- this for method chaining
-
apply
public java.util.List<ManifestFile> apply(TableMetadata base)
Apply the update's changes to the base table metadata and return the new manifest list.- Parameters:
base
- the base table metadata to apply changes to- Returns:
- a manifest list for the new snapshot.
-
set
public ThisT set(java.lang.String property, java.lang.String value)
Description copied from interface:SnapshotUpdate
Set a summary property in the snapshot produced by this update.- Parameters:
property
- a String property namevalue
- a String property value- Returns:
- this for method chaining
-
writeSpec
protected PartitionSpec writeSpec()
-
rowFilter
protected Expression rowFilter()
-
addedFiles
protected java.util.List<DataFile> addedFiles()
-
failAnyDelete
protected void failAnyDelete()
-
failMissingDeletePaths
protected void failMissingDeletePaths()
-
deleteByRowFilter
protected void deleteByRowFilter(Expression expr)
Add a filter to match files to delete. A file will be deleted if all of the rows it contains match this or any other filter passed to this method.- Parameters:
expr
- an expression to match rows.
-
dropPartition
protected void dropPartition(StructLike partition)
Add a partition tuple to drop from the table during the delete phase.
-
delete
protected void delete(DataFile file)
Add a specific data file to be deleted in the new snapshot.
-
delete
protected void delete(DeleteFile file)
Add a specific delete file to be deleted in the new snapshot.
-
delete
protected void delete(java.lang.CharSequence path)
Add a specific data path to be deleted in the new snapshot.
-
add
protected void add(DataFile file)
Add a data file to the new snapshot.
-
add
protected void add(DeleteFile file)
Add a delete file to the new snapshot.
-
add
protected void add(ManifestFile manifest)
Add all files in a manifest to the new snapshot.
-
summary
protected java.util.Map<java.lang.String,java.lang.String> summary()
-
updateEvent
public java.lang.Object updateEvent()
Description copied from interface:PendingUpdate
Generates update event to notify about metadata changes- Returns:
- the generated event
-
cleanUncommitted
protected void cleanUncommitted(java.util.Set<ManifestFile> committed)
Clean up any uncommitted manifests that were created.Manifests may not be committed if apply is called more because a commit conflict has occurred. Implementations may keep around manifests because the same changes will be made by both apply calls. This method instructs the implementation to clean up those manifests and passes the paths of the manifests that were actually committed.
- Parameters:
committed
- a set of manifest paths that were actually committed
-
stageOnly
public ThisT stageOnly()
Description copied from interface:SnapshotUpdate
Called to stage a snapshot in table metadata, but not update the current snapshot id.- Specified by:
stageOnly
in interfaceSnapshotUpdate<ThisT>
- Returns:
- this for method chaining
-
deleteWith
public ThisT deleteWith(java.util.function.Consumer<java.lang.String> deleteCallback)
Description copied from interface:SnapshotUpdate
Set a callback to delete files instead of the table's default.- Specified by:
deleteWith
in interfaceSnapshotUpdate<ThisT>
- Parameters:
deleteCallback
- a String consumer used to delete locations.- Returns:
- this for method chaining
-
apply
public Snapshot apply()
Description copied from interface:PendingUpdate
Apply the pending changes and return the uncommitted changes for validation.This does not result in a permanent update.
- Specified by:
apply
in interfacePendingUpdate<ThisT>
- Returns:
- the uncommitted changes that would be committed by calling
PendingUpdate.commit()
-
current
protected TableMetadata current()
-
refresh
protected TableMetadata refresh()
-
commit
public void commit()
Description copied from interface:PendingUpdate
Apply the pending changes and commit.Changes are committed by calling the underlying table's commit method.
Once the commit is successful, the updated table will be refreshed.
- Specified by:
commit
in interfacePendingUpdate<ThisT>
-
cleanAll
protected void cleanAll()
-
deleteFile
protected void deleteFile(java.lang.String path)
-
manifestListPath
protected OutputFile manifestListPath()
-
newManifestOutput
protected OutputFile newManifestOutput()
-
newManifestWriter
protected ManifestWriter<DataFile> newManifestWriter(PartitionSpec spec)
-
newDeleteManifestWriter
protected ManifestWriter<DeleteFile> newDeleteManifestWriter(PartitionSpec spec)
-
newManifestReader
protected ManifestReader<DataFile> newManifestReader(ManifestFile manifest)
-
newDeleteManifestReader
protected ManifestReader<DeleteFile> newDeleteManifestReader(ManifestFile manifest)
-
snapshotId
protected long snapshotId()
-
-