Interface DeletedRowsScanTask

  • All Superinterfaces:
    ChangelogScanTask, ContentScanTask<DataFile>, PartitionScanTask, ScanTask, java.io.Serializable

    public interface DeletedRowsScanTask
    extends ChangelogScanTask, ContentScanTask<DataFile>
    A scan task for deletes generated by adding delete files to the table.

    Suppose snapshot S1 contains data files F1, F2, and F3. Snapshot S2 then adds a position delete file, D1, that deletes records from F2, and snapshot S3 adds an equality delete file, D2, that removes records from F1, F2, and F3. A scan for changes from S2 to S3 (inclusive) should include the following tasks:

    • DeletedRowsScanTask(file=F2, added-deletes=[D1], existing-deletes=[], snapshot=S2)
    • DeletedRowsScanTask(file=F1, added-deletes=[D2], existing-deletes=[], snapshot=S3)
    • DeletedRowsScanTask(file=F2, added-deletes=[D2], existing-deletes=[D1], snapshot=S3)
    • DeletedRowsScanTask(file=F3, added-deletes=[D2], existing-deletes=[], snapshot=S3)
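
    The task list above can be reproduced with a small model. This is a minimal sketch, not Iceberg's planner: files and snapshots are plain strings, each delete file is mapped explicitly to the data files it affects, and DeletedRowsPlanSketch, planTasks, and exampleTasks are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: names are plain strings and planTasks is a
// hypothetical helper, not part of the Iceberg API.
final class DeletedRowsPlanSketch {

  // deletesBySnapshot: snapshot -> (delete file added there -> data files it affects),
  // in commit order. Tasks are emitted from fromSnapshot onward.
  static List<String> planTasks(
      Map<String, Map<String, List<String>>> deletesBySnapshot, String fromSnapshot) {
    Map<String, List<String>> existing = new LinkedHashMap<>(); // data file -> earlier deletes
    List<String> tasks = new ArrayList<>();
    boolean inRange = false;

    for (Map.Entry<String, Map<String, List<String>>> snapshot : deletesBySnapshot.entrySet()) {
      inRange = inRange || snapshot.getKey().equals(fromSnapshot);

      // Group the delete files added by this snapshot per affected data file.
      Map<String, List<String>> added = new LinkedHashMap<>();
      for (Map.Entry<String, List<String>> delete : snapshot.getValue().entrySet()) {
        for (String dataFile : delete.getValue()) {
          added.computeIfAbsent(dataFile, f -> new ArrayList<>()).add(delete.getKey());
        }
      }

      for (Map.Entry<String, List<String>> entry : added.entrySet()) {
        String dataFile = entry.getKey();
        if (inRange) {
          tasks.add(String.format(
              "DeletedRowsScanTask(file=%s, added-deletes=%s, existing-deletes=%s, snapshot=%s)",
              dataFile, entry.getValue(),
              existing.getOrDefault(dataFile, Collections.emptyList()), snapshot.getKey()));
        }
        // Deletes added here count as existing for every later snapshot.
        existing.computeIfAbsent(dataFile, f -> new ArrayList<>()).addAll(entry.getValue());
      }
    }
    return tasks;
  }

  // The S1..S3 history from the example, scanned from S2 onward.
  static List<String> exampleTasks() {
    Map<String, Map<String, List<String>>> history = new LinkedHashMap<>();
    history.put("S1", Collections.emptyMap()); // S1 only adds data files F1, F2, F3
    history.put("S2", Collections.singletonMap("D1", Arrays.asList("F2")));
    history.put("S3", Collections.singletonMap("D2", Arrays.asList("F1", "F2", "F3")));
    return planTasks(history, "S2");
  }

  public static void main(String[] args) {
    exampleTasks().forEach(System.out::println);
  }
}
```

    Note that D1 is an added delete in the S2 task but an existing delete in the S3 task for F2, because the model folds each snapshot's deletes into the existing set once that snapshot has been processed.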

    Readers consuming these tasks should produce deleted records with metadata such as the change ordinal and the commit snapshot ID.
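
    As a rough illustration of what such a record might carry, the sketch below pairs each deleted row with its metadata and orders the rows for application. The DeletedRow shape and the inApplyOrder helper are hypothetical, and the sketch assumes that ordinals start at 0 for the first snapshot in the scan range and that lower ordinals are applied first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical record shape; field names are illustrative, not an Iceberg API.
final class ChangelogRowSketch {

  static final class DeletedRow {
    final String row;
    final int changeOrdinal;     // assumed: 0 for the first snapshot in the scan range
    final long commitSnapshotId; // ID of the snapshot that added the delete file

    DeletedRow(String row, int changeOrdinal, long commitSnapshotId) {
      this.row = row;
      this.changeOrdinal = changeOrdinal;
      this.commitSnapshotId = commitSnapshotId;
    }

    @Override
    public String toString() {
      return "DELETE " + row + " (ordinal=" + changeOrdinal
          + ", snapshot=" + commitSnapshotId + ")";
    }
  }

  // Returns a copy of the rows sorted so that lower ordinals come first.
  static List<DeletedRow> inApplyOrder(List<DeletedRow> rows) {
    List<DeletedRow> sorted = new ArrayList<>(rows);
    sorted.sort(Comparator.comparingInt(r -> r.changeOrdinal));
    return sorted;
  }
}
```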

    • Method Detail

      • addedDeletes

        java.util.List<DeleteFile> addedDeletes()
        A list of added delete files that apply to the task's data file. Records removed by these delete files should appear as deletes in the changelog.
        Returns:
        a list of added delete files
      • existingDeletes

        java.util.List<DeleteFile> existingDeletes()
        A list of delete files that already existed before this task's snapshot. They must be applied first, before determining which records are deleted by the delete files in addedDeletes(). Records removed by these existing delete files should not appear in the changelog.
        Returns:
        a list of existing delete files
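
        The interplay between addedDeletes() and existingDeletes() can be sketched as a two-step filter. This is a simplified model under stated assumptions: each delete file is reduced to the set of row keys it removes, and DeletedRowsReadSketch and changelogDeletes are hypothetical names, not Iceberg reader code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model: a delete file is just the set of row keys it removes.
final class DeletedRowsReadSketch {

  // Returns the rows that should appear as deletes in the changelog.
  static List<String> changelogDeletes(
      List<String> dataFileRows,
      List<Set<String>> existingDeletes,
      List<Set<String>> addedDeletes) {
    Set<String> alreadyDeleted = new HashSet<>();
    existingDeletes.forEach(alreadyDeleted::addAll);
    Set<String> newlyDeleted = new HashSet<>();
    addedDeletes.forEach(newlyDeleted::addAll);

    List<String> result = new ArrayList<>();
    for (String row : dataFileRows) {
      if (alreadyDeleted.contains(row)) {
        continue; // removed by an earlier snapshot, so not a change in this one
      }
      if (newlyDeleted.contains(row)) {
        result.add(row); // still live before this snapshot, deleted by it
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> f2 = Arrays.asList("a", "b", "c", "d");
    Set<String> d1 = new HashSet<>(Arrays.asList("b"));      // added in S2
    Set<String> d2 = new HashSet<>(Arrays.asList("b", "c")); // added in S3

    // S2 task for F2: added=[D1], existing=[]
    System.out.println(changelogDeletes(f2, Collections.emptyList(), Arrays.asList(d1)));
    // S3 task for F2: added=[D2], existing=[D1]
    System.out.println(changelogDeletes(f2, Arrays.asList(d1), Arrays.asList(d2)));
  }
}
```

        In this model, row "b" is reported only by the S2 task. D2 also matches "b", but D1 had already removed it, so the S3 task reports only "c".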
      • sizeBytes

        default long sizeBytes()
        Description copied from interface: ScanTask
        The number of bytes that should be read by this scan task.
        Specified by:
        sizeBytes in interface ContentScanTask<DataFile>
        Specified by:
        sizeBytes in interface ScanTask
        Returns:
        the total number of bytes to read
      • filesCount

        default int filesCount()
        Description copied from interface: ScanTask
        The number of files that will be opened by this scan task.
        Specified by:
        filesCount in interface ScanTask
        Returns:
        the number of files to open
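
        This page does not say how the default sizeBytes() and filesCount() are computed. One plausible accounting, assumed here rather than taken from the source, is that a task reads its data file plus every added and existing delete file. TaskCostSketch and its parameters are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Assumed accounting: one data file plus all added and existing delete files.
final class TaskCostSketch {

  // Sums the data file's bytes and the bytes of every delete file.
  static long sizeBytes(
      long dataFileBytes, List<Long> addedDeleteBytes, List<Long> existingDeleteBytes) {
    long total = dataFileBytes;
    for (long bytes : addedDeleteBytes) {
      total += bytes;
    }
    for (long bytes : existingDeleteBytes) {
      total += bytes;
    }
    return total;
  }

  // Counts the single data file plus each delete file.
  static int filesCount(List<Long> addedDeleteBytes, List<Long> existingDeleteBytes) {
    return 1 + addedDeleteBytes.size() + existingDeleteBytes.size();
  }

  public static void main(String[] args) {
    List<Long> added = Arrays.asList(100L);
    List<Long> existing = Arrays.asList(50L, 25L);
    System.out.println(sizeBytes(1000L, added, existing));
    System.out.println(filesCount(added, existing));
  }
}
```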