Interface ScanTaskGroup<T extends ScanTask>

Type Parameters:
T - the type of scan tasks
All Superinterfaces:
ScanTask, Serializable
All Known Subinterfaces:
CombinedScanTask
All Known Implementing Classes:
BaseCombinedScanTask, BaseScanTaskGroup

public interface ScanTaskGroup<T extends ScanTask> extends ScanTask
A scan task that may include partial input files, multiple input files or both.
  • Method Details

    • groupingKey

      default StructLike groupingKey()
      Returns a grouping key for this task group.

      A grouping key is a set of values common to all rows produced by the tasks in this task group. The values may be the result of transforming the underlying data: for example, a grouping key can consist of a bucket ordinal computed by applying a bucket transform to a column of the underlying rows. The grouping key type is determined at planning time and is identical across all task groups produced by a scan.

      Implementations should return an empty struct if the data grouping is random or unknown.

      Returns:
      a grouping key for this task group
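The bucket-ordinal example above can be illustrated with a small, self-contained sketch. It does not use the Iceberg API: `Task`, `groupByBucket`, and the use of a `List<Object>` in place of `StructLike` are hypothetical stand-ins, and each task is assumed to already carry the bucket ordinal shared by its rows. Under those assumptions, tasks with the same ordinal end up in the same group, and that ordinal is the group's key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupingKeySketch {

  // Hypothetical task: one file whose rows all fall in the same bucket.
  static final class Task {
    final String file;
    final int bucket; // ASSUMPTION: bucket ordinal already computed at planning time

    Task(String file, int bucket) {
      this.file = file;
      this.bucket = bucket;
    }
  }

  // Groups tasks by bucket ordinal. The map key plays the role of the grouping
  // key; a List<Object> stands in for StructLike in this sketch.
  static Map<List<Object>, List<Task>> groupByBucket(List<Task> tasks) {
    Map<List<Object>, List<Task>> groups = new LinkedHashMap<>();
    for (Task task : tasks) {
      List<Object> key = Collections.<Object>singletonList(task.bucket);
      groups.computeIfAbsent(key, k -> new ArrayList<>()).add(task);
    }
    return groups;
  }

  // Stand-in for the "empty struct" returned when the grouping is random or unknown.
  static List<Object> unknownGroupingKey() {
    return Collections.emptyList();
  }

  public static void main(String[] args) {
    List<Task> tasks = Arrays.asList(
        new Task("a.parquet", 0),
        new Task("b.parquet", 1),
        new Task("c.parquet", 0));

    for (Map.Entry<List<Object>, List<Task>> entry : groupByBucket(tasks).entrySet()) {
      System.out.println("key=" + entry.getKey() + " tasks=" + entry.getValue().size());
    }
  }
}
```

Every group produced this way has a key of the same shape (a single integer), which mirrors the statement that the grouping key type is identical across all task groups of a scan.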
    • tasks

      Collection<T> tasks()
      Returns the scan tasks in this group.
    • sizeBytes

      default long sizeBytes()
      Description copied from interface: ScanTask
      The number of bytes that should be read by this scan task.
      Specified by:
      sizeBytes in interface ScanTask
      Returns:
      the total number of bytes to read
    • estimatedRowsCount

      default long estimatedRowsCount()
      Description copied from interface: ScanTask
      The estimated number of rows produced by this scan task.
      Specified by:
      estimatedRowsCount in interface ScanTask
      Returns:
      the estimated number of produced rows
    • filesCount

      default int filesCount()
      Description copied from interface: ScanTask
      The number of files that will be opened by this scan task.
      Specified by:
      filesCount in interface ScanTask
      Returns:
      the number of files to open
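Because `sizeBytes()`, `estimatedRowsCount()`, and `filesCount()` are all declared as defaults on a group whose only abstract method is `tasks()`, a natural reading is that each default aggregates the corresponding value over the group's tasks. The page does not state this, so the sketch below treats summation as an assumption. `SimpleTask`, `SimpleTaskGroup`, `FileSlice`, and `SliceGroup` are hypothetical stand-ins, not Iceberg types.

```java
import java.util.Arrays;
import java.util.Collection;

public class ScanTaskGroupSketch {

  // Hypothetical stand-in for ScanTask, reduced to its three estimates.
  interface SimpleTask {
    long sizeBytes();
    long estimatedRowsCount();
    int filesCount();
  }

  // Hypothetical stand-in for ScanTaskGroup.
  // ASSUMPTION: each default sums the per-task value over tasks().
  interface SimpleTaskGroup<T extends SimpleTask> extends SimpleTask {
    Collection<T> tasks();

    @Override
    default long sizeBytes() {
      return tasks().stream().mapToLong(SimpleTask::sizeBytes).sum();
    }

    @Override
    default long estimatedRowsCount() {
      return tasks().stream().mapToLong(SimpleTask::estimatedRowsCount).sum();
    }

    @Override
    default int filesCount() {
      return tasks().stream().mapToInt(SimpleTask::filesCount).sum();
    }
  }

  // A task that reads one slice of one file.
  static final class FileSlice implements SimpleTask {
    private final long bytes;
    private final long rows;

    FileSlice(long bytes, long rows) {
      this.bytes = bytes;
      this.rows = rows;
    }

    @Override public long sizeBytes() { return bytes; }
    @Override public long estimatedRowsCount() { return rows; }
    @Override public int filesCount() { return 1; }
  }

  // A group only has to supply tasks(); the estimates come from the defaults.
  static final class SliceGroup implements SimpleTaskGroup<FileSlice> {
    private final Collection<FileSlice> slices;

    SliceGroup(Collection<FileSlice> slices) {
      this.slices = slices;
    }

    @Override public Collection<FileSlice> tasks() { return slices; }
  }

  static SliceGroup sampleGroup() {
    return new SliceGroup(Arrays.asList(
        new FileSlice(128L, 1000L),
        new FileSlice(256L, 2500L)));
  }

  public static void main(String[] args) {
    SliceGroup group = sampleGroup();
    System.out.println(group.sizeBytes());          // 384
    System.out.println(group.estimatedRowsCount()); // 3500
    System.out.println(group.filesCount());         // 2
  }
}
```

An implementation that can compute these values more cheaply, for example by caching them at construction time, could override the defaults instead of re-scanning `tasks()` on every call.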