Interface AddedRowsScanTask

  • All Superinterfaces:
    ChangelogScanTask, ContentScanTask<DataFile>, PartitionScanTask, ScanTask, java.io.Serializable

    public interface AddedRowsScanTask
    extends ChangelogScanTask, ContentScanTask<DataFile>
    A scan task for inserts generated by adding a data file to the table.

    Note that added data files may have matching delete files. This can happen when a matching position delete file is committed in the same snapshot as the data file, or when changes from multiple snapshots are squashed together.

    Suppose snapshot S1 adds data files F1, F2, F3 and a position delete file, D1, that marks particular records in F1 as deleted. A scan for changes generated by S1 should include the following tasks:

    • AddedRowsScanTask(file=F1, deletes=[D1], snapshot=S1)
    • AddedRowsScanTask(file=F2, deletes=[], snapshot=S1)
    • AddedRowsScanTask(file=F3, deletes=[], snapshot=S1)

    Readers consuming these tasks should produce the added records together with metadata such as the change ordinal and the commit snapshot ID.
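
The S1 example above can be sketched in plain Java as follows. This is an illustrative model only: `AddedRowsExample`, `PosDelete`, `Task`, and `plan` are hypothetical stand-ins, not Iceberg classes, and the sketch assumes each position delete file targets exactly one data file, which is a simplification of how delete files are actually matched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AddedRowsExample {
  // Hypothetical stand-in for a position delete file; assumes it targets a single data file.
  static final class PosDelete {
    final String name;
    final String targetDataFile;

    PosDelete(String name, String targetDataFile) {
      this.name = name;
      this.targetDataFile = targetDataFile;
    }
  }

  // Hypothetical stand-in for AddedRowsScanTask(file, deletes, snapshot).
  static final class Task {
    final String file;
    final List<String> deletes;
    final String snapshot;

    Task(String file, List<String> deletes, String snapshot) {
      this.file = file;
      this.deletes = deletes;
      this.snapshot = snapshot;
    }

    @Override
    public String toString() {
      return "AddedRowsScanTask(file=" + file + ", deletes=" + deletes
          + ", snapshot=" + snapshot + ")";
    }
  }

  // Emits one task per added data file, attaching only the delete files that target it.
  static List<Task> plan(String snapshot, List<String> addedFiles, List<PosDelete> posDeletes) {
    List<Task> tasks = new ArrayList<>();
    for (String file : addedFiles) {
      List<String> matching = new ArrayList<>();
      for (PosDelete delete : posDeletes) {
        if (delete.targetDataFile.equals(file)) {
          matching.add(delete.name);
        }
      }
      tasks.add(new Task(file, matching, snapshot));
    }
    return tasks;
  }

  public static void main(String[] args) {
    List<Task> tasks = plan(
        "S1",
        Arrays.asList("F1", "F2", "F3"),
        Collections.singletonList(new PosDelete("D1", "F1")));
    for (Task task : tasks) {
      System.out.println(task);
    }
  }
}
```

Running `main` prints one line per task in the same shape as the list above: F1 carries D1, while F2 and F3 carry empty delete lists.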

    • Method Detail

      • deletes

        java.util.List<DeleteFile> deletes()
        A list of delete files to apply when reading the data file in this task.
        Returns:
        a list of delete files to apply
      • sizeBytes

        default long sizeBytes()
        Description copied from interface: ScanTask
        The number of bytes that should be read by this scan task.
        Specified by:
        sizeBytes in interface ContentScanTask<DataFile>
        Specified by:
        sizeBytes in interface ScanTask
        Returns:
        the total number of bytes to read
      • filesCount

        default int filesCount()
        Description copied from interface: ScanTask
        The number of files that will be opened by this scan task.
        Specified by:
        filesCount in interface ScanTask
        Returns:
        the number of files to open
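
The descriptions of sizeBytes and filesCount are copied from ScanTask and do not say how delete files are counted. The sketch below shows one plausible accounting, under the assumption that a task reads its data file plus every delete file returned by deletes(). `ScanTaskSizing` and its parameters are hypothetical names, and the byte sizes are made-up illustration values.

```java
import java.util.Arrays;
import java.util.List;

class ScanTaskSizing {
  // Assumed accounting: bytes read from the data file plus the full size of each delete file.
  static long sizeBytes(long dataLength, List<Long> deleteFileSizes) {
    long total = dataLength;
    for (long size : deleteFileSizes) {
      total += size;
    }
    return total;
  }

  // Assumed accounting: one open for the data file plus one per delete file.
  static int filesCount(List<Long> deleteFileSizes) {
    return 1 + deleteFileSizes.size();
  }

  public static void main(String[] args) {
    // Hypothetical task: a 1,000,000-byte data file with one 2,048-byte delete file.
    List<Long> deleteSizes = Arrays.asList(2_048L);
    System.out.println(sizeBytes(1_000_000L, deleteSizes)); // prints 1002048
    System.out.println(filesCount(deleteSizes));            // prints 2
  }
}
```

Under this assumption, a task with no delete files reports the data length alone and a files count of 1.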