Package org.apache.iceberg
Interface DeletedRowsScanTask
All Superinterfaces:
ChangelogScanTask, ContentScanTask<DataFile>, PartitionScanTask, ScanTask, java.io.Serializable
public interface DeletedRowsScanTask extends ChangelogScanTask, ContentScanTask<DataFile>
A scan task for deletes generated by adding delete files to the table.

Suppose snapshot S1 contains data files F1, F2, and F3. Snapshot S2 then adds a position delete file, D1, that deletes records from F2, and snapshot S3 adds an equality delete file, D2, that removes records from F1, F2, and F3. A scan for changes from S2 to S3 (inclusive) should include the following tasks:
- DeletedRowsScanTask(file=F2, added-deletes=[D1], existing-deletes=[], snapshot=S2)
- DeletedRowsScanTask(file=F1, added-deletes=[D2], existing-deletes=[], snapshot=S3)
- DeletedRowsScanTask(file=F2, added-deletes=[D2], existing-deletes=[D1], snapshot=S3)
- DeletedRowsScanTask(file=F3, added-deletes=[D2], existing-deletes=[], snapshot=S3)
Readers consuming these tasks should produce the deleted records along with metadata such as the change ordinal and the commit snapshot ID.
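The example above can be reproduced with a small, self-contained model. This is only a sketch and does not use the Iceberg API: plain strings stand in for data files, delete files, and snapshots, and the names `DeleteCommit`, `Task`, `plan`, and `fromIndex` are hypothetical. It also assumes that the data files affected by each delete file are already known, which a real planner would have to work out from table metadata.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model; strings stand in for DataFile, DeleteFile, and snapshots.
final class DeletedRowsPlanSketch {

  // One commit that adds a single delete file applying to the listed data files.
  record DeleteCommit(String snapshot, String deleteFile, List<String> affectedDataFiles) {}

  // Mirrors the fields shown in the example: file, added-deletes, existing-deletes, snapshot.
  record Task(String file, List<String> addedDeletes, List<String> existingDeletes, String snapshot) {}

  // Walks commits in order. Commits before fromIndex produce no tasks,
  // but their delete files still count as existing deletes for later commits.
  static List<Task> plan(List<DeleteCommit> commits, int fromIndex) {
    Map<String, List<String>> existingByFile = new LinkedHashMap<>();
    List<Task> tasks = new ArrayList<>();
    for (int i = 0; i < commits.size(); i++) {
      DeleteCommit commit = commits.get(i);
      for (String dataFile : commit.affectedDataFiles()) {
        List<String> existing = existingByFile.computeIfAbsent(dataFile, k -> new ArrayList<>());
        if (i >= fromIndex) {
          tasks.add(new Task(
              dataFile, List.of(commit.deleteFile()), List.copyOf(existing), commit.snapshot()));
        }
        // From the next commit onward, this delete file is an existing delete.
        existing.add(commit.deleteFile());
      }
    }
    return tasks;
  }

  public static void main(String[] args) {
    List<DeleteCommit> commits = List.of(
        new DeleteCommit("S2", "D1", List.of("F2")),
        new DeleteCommit("S3", "D2", List.of("F1", "F2", "F3")));
    // Scan from S2 (index 0) to S3, inclusive.
    plan(commits, 0).forEach(System.out::println);
  }
}
```

Running `main` prints four tasks in the same order as the list above. Only the task for F2 in S3 carries a non-empty existing-deletes list, because D1 was committed in the earlier snapshot.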
-
Method Summary
Modifier and Type, Method, Description:
- java.util.List<DeleteFile> addedDeletes(): A list of added delete files that apply to the task's data file.
- java.util.List<DeleteFile> existingDeletes(): A list of delete files that existed before and must be applied prior to determining which records are deleted by delete files in addedDeletes().
- default int filesCount(): The number of files that will be opened by this scan task.
- default ChangelogOperation operation(): Returns the type of changes produced by this task (i.e. insert/delete).
- default long sizeBytes(): The number of bytes that should be read by this scan task.
-
Methods inherited from interface org.apache.iceberg.ChangelogScanTask
changeOrdinal, commitSnapshotId
-
Methods inherited from interface org.apache.iceberg.ContentScanTask
estimatedRowsCount, file, length, partition, residual, start
-
Methods inherited from interface org.apache.iceberg.PartitionScanTask
spec
-
Methods inherited from interface org.apache.iceberg.ScanTask
asCombinedScanTask, asDataTask, asFileScanTask, isDataTask, isFileScanTask
-
Method Detail
-
addedDeletes
java.util.List<DeleteFile> addedDeletes()
A list of added delete files that apply to the task's data file. Records removed by these delete files should appear as deletes in the changelog.
- Returns:
- a list of added delete files
-
existingDeletes
java.util.List<DeleteFile> existingDeletes()
A list of delete files that existed before and must be applied prior to determining which records are deleted by delete files in addedDeletes(). Records removed by these delete files should not appear in the changelog.
- Returns:
- a list of existing delete files
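The ordering described for addedDeletes() and existingDeletes() can be sketched as a two-step filter. This is a minimal illustration under stated assumptions, not Iceberg's reader: each delete file is modeled as a hypothetical set of deleted row positions, which corresponds only to position deletes, and the names `DeletedRowsReadSketch` and `deletedPositions` are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model: each delete file is reduced to the set of row positions it removes.
final class DeletedRowsReadSketch {

  // Returns the positions a changelog reader would emit as deletes:
  // rows still live after the existing deletes that an added delete then removes.
  static List<Long> deletedPositions(
      long rowCount, List<Set<Long>> existingDeletes, List<Set<Long>> addedDeletes) {
    List<Long> result = new ArrayList<>();
    for (long pos = 0; pos < rowCount; pos++) {
      if (matchesAny(existingDeletes, pos)) {
        continue; // removed by an earlier snapshot, so it must not appear in the changelog
      }
      if (matchesAny(addedDeletes, pos)) {
        result.add(pos); // newly removed by this task's snapshot
      }
    }
    return result;
  }

  private static boolean matchesAny(List<Set<Long>> deletes, long pos) {
    for (Set<Long> delete : deletes) {
      if (delete.contains(pos)) {
        return true;
      }
    }
    return false;
  }
}
```

For a five-row data file with an existing delete covering positions 1 and 2 and an added delete covering positions 2 and 3, only position 3 is reported. Position 2 was already gone before the added delete file was committed.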
-
operation
default ChangelogOperation operation()
Description copied from interface: ChangelogScanTask
Returns the type of changes produced by this task (i.e. insert/delete).
- Specified by:
operation in interface ChangelogScanTask
-
sizeBytes
default long sizeBytes()
Description copied from interface: ScanTask
The number of bytes that should be read by this scan task.
- Specified by:
sizeBytes in interface ContentScanTask<DataFile>
- Specified by:
sizeBytes in interface ScanTask
- Returns:
- the total number of bytes to read
-
filesCount
default int filesCount()
Description copied from interface: ScanTask
The number of files that will be opened by this scan task.
- Specified by:
filesCount in interface ScanTask
- Returns:
- the number of files to open
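This page does not say how the default sizeBytes() and filesCount() values are computed. One plausible reading, offered purely as an assumption, is that a task opens its data file plus every added and existing delete file, and reads the bytes of all of them. The sketch below models that reading with a hypothetical `FileRef` record and does not reflect the actual default implementations.

```java
import java.util.List;

// Assumption-based sketch; FileRef and both methods are hypothetical, not Iceberg API.
final class TaskCostSketch {

  // Stand-in for a file with a known size in bytes.
  record FileRef(String path, long sizeInBytes) {}

  // Assumed: one data file plus each added and existing delete file is opened.
  static int filesCount(List<FileRef> addedDeletes, List<FileRef> existingDeletes) {
    return 1 + addedDeletes.size() + existingDeletes.size();
  }

  // Assumed: the bytes read from the data file plus the full size of each delete file.
  static long sizeBytes(long dataLength, List<FileRef> addedDeletes, List<FileRef> existingDeletes) {
    long total = dataLength;
    for (FileRef file : addedDeletes) {
      total += file.sizeInBytes();
    }
    for (FileRef file : existingDeletes) {
      total += file.sizeInBytes();
    }
    return total;
  }
}
```

Under this assumption, the F2 task for snapshot S3 in the class description (one added delete, one existing delete) would count three files.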