Interface ContentScanTask<F extends ContentFile<F>>

Type Parameters:
F - the Java class of the content file
All Superinterfaces:
PartitionScanTask, ScanTask, Serializable
All Known Subinterfaces:
AddedRowsScanTask, DataTask, DeletedDataFileScanTask, DeletedRowsScanTask, FileScanTask, PositionDeletesScanTask
All Known Implementing Classes:
BaseFileScanTask

public interface ContentScanTask<F extends ContentFile<F>> extends ScanTask, PartitionScanTask
A scan task over a range of bytes in a content file.
  • Method Details

    • file

      F file()
      The file to scan.
      Returns:
      the file to scan
    • partition

      default StructLike partition()
      Description copied from interface: PartitionScanTask
      Returns the value of the partition for this scan task.
      Specified by:
      partition in interface PartitionScanTask
    • sizeBytes

      default long sizeBytes()
      Description copied from interface: ScanTask
      The number of bytes that should be read by this scan task.
      Specified by:
      sizeBytes in interface ScanTask
      Returns:
      the total number of bytes to read
    • start

      long start()
      The starting position of this scan range in the file.
      Returns:
      the start position of this scan range
    • length

      long length()
      The number of bytes to scan from the start() position in the file.
      Returns:
      the length of this scan range in bytes
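
      Read literally, start() and length() describe the half-open byte range [start, start + length) within the file. The sketch below illustrates that arithmetic with a hypothetical stand-in interface, ByteRangeTask, which mirrors only these two accessors and is not part of the library. The helper names endExclusive, covers, and of are likewise assumptions made for illustration.

```java
// Hypothetical stand-in that mirrors only start() and length(); not the library interface.
interface ByteRangeTask {
  long start();

  long length();
}

final class ScanRangeSketch {
  private ScanRangeSketch() {}

  // Exclusive end offset of the half-open range [start, start + length).
  static long endExclusive(ByteRangeTask task) {
    return task.start() + task.length();
  }

  // True when the given file offset falls inside the task's byte range.
  static boolean covers(ByteRangeTask task, long offset) {
    return offset >= task.start() && offset < endExclusive(task);
  }

  // Builds a simple immutable task for illustration.
  static ByteRangeTask of(final long start, final long length) {
    return new ByteRangeTask() {
      @Override
      public long start() {
        return start;
      }

      @Override
      public long length() {
        return length;
      }
    };
  }
}
```

      For example, a task created with ScanRangeSketch.of(128L, 64L) covers offsets 128 through 191 and ends (exclusively) at offset 192.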
    • residual

      Expression residual()
      Returns the residual expression that should be applied to rows in this file scan.

      The residual expression for a file is a filter expression created by partially evaluating the scan's filter using the file's partition data.

      Returns:
      a residual expression to apply to rows from this scan
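
      To illustrate partial evaluation, the toy sketch below models a filter as a conjunction of column equality predicates. Predicates on partition columns are decided from the file's partition values, and whatever cannot be decided is left over as the residual. All types here (Eq, ResidualSketch) are hypothetical and self-contained; they are not the library's Expression API, and real filters and partition transforms are more general than this.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical equality predicate "column = value"; not the library's Expression type.
final class Eq {
  final String column;
  final Object value;

  Eq(String column, Object value) {
    this.column = column;
    this.value = value;
  }
}

final class ResidualSketch {
  private ResidualSketch() {}

  /**
   * Partially evaluates a conjunction of predicates against one file's partition values.
   * Returns Optional.empty() when a partition predicate is false, meaning no row in the
   * file can match. Otherwise returns the predicates that still need a row-by-row check.
   */
  static Optional<List<Eq>> residual(List<Eq> filter, Map<String, Object> partition) {
    List<Eq> remaining = new ArrayList<>();
    for (Eq pred : filter) {
      if (partition.containsKey(pred.column)) {
        if (!Objects.equals(pred.value, partition.get(pred.column))) {
          return Optional.empty(); // the whole conjunction is false for this file
        }
        // The predicate holds for every row in the file, so it drops out.
      } else {
        remaining.add(pred); // cannot be decided from partition data alone
      }
    }
    return Optional.of(Collections.unmodifiableList(remaining));
  }
}
```

      With the filter day = "2024-01-01" AND id = 5 and a file whose partition has day = "2024-01-01", the residual in this sketch is the single predicate id = 5. An empty residual list means every row in the file already satisfies the filter.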
    • estimatedRowsCount

      default long estimatedRowsCount()
      Description copied from interface: ScanTask
      The estimated number of rows produced by this scan task.
      Specified by:
      estimatedRowsCount in interface ScanTask
      Returns:
      the estimated number of produced rows
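
      One plausible way to estimate the row count for a task that covers only part of a file is to scale the file's record count by the fraction of bytes scanned. The sketch below assumes rows are spread evenly across the file's bytes. The class RowEstimateSketch and the parameter names fileSizeInBytes and recordCount are illustrative assumptions, and the library's default implementation may differ.

```java
final class RowEstimateSketch {
  private RowEstimateSketch() {}

  // Scales the file's record count by the fraction of the file covered by the scan range.
  static long estimateRows(long length, long fileSizeInBytes, long recordCount) {
    if (fileSizeInBytes <= 0) {
      return 0L; // avoid division by zero for an empty or unknown-size file
    }
    double scannedFraction = (double) length / fileSizeInBytes;
    return (long) (scannedFraction * recordCount);
  }
}
```

      Under this assumption, a task that scans 50 bytes of a 200-byte file holding 1000 records is estimated to produce 250 rows.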