Interface TableScan

    • Method Detail

      • table

        Table table()
        Returns the Table from which this scan loads data.
        Returns:
        this scan's table
      • useSnapshot

        TableScan useSnapshot​(long snapshotId)
        Create a new TableScan from this scan's configuration that will use the snapshot with the given ID.
        Parameters:
        snapshotId - a snapshot ID
        Returns:
        a new scan based on this with the given snapshot ID
        Throws:
        java.lang.IllegalArgumentException - if the snapshot cannot be found
      • asOfTime

        TableScan asOfTime​(long timestampMillis)
        Create a new TableScan from this scan's configuration that will use the most recent snapshot as of the given time in milliseconds.
        Parameters:
        timestampMillis - a timestamp in milliseconds
        Returns:
        a new scan based on this that uses the snapshot that was current at the given time
        Throws:
        java.lang.IllegalArgumentException - if the snapshot cannot be found
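The "most recent snapshot as of the given time" rule can be pictured as a floor lookup over commit timestamps. The following is a minimal sketch under stated assumptions, not Iceberg's implementation: the class SnapshotLookup, the method snapshotIdAsOfTime, and the TreeMap-based history are hypothetical, and treating a snapshot committed exactly at timestampMillis as a match is an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of time-travel lookup; a hypothetical helper, not part of Iceberg. */
final class SnapshotLookup {
    private SnapshotLookup() {}

    /**
     * Returns the ID of the most recent snapshot committed at or before timestampMillis.
     *
     * @param history hypothetical map from commit time in milliseconds to snapshot ID
     */
    static long snapshotIdAsOfTime(TreeMap<Long, Long> history, long timestampMillis) {
        // floorEntry returns the entry with the greatest key <= timestampMillis, or null.
        Map.Entry<Long, Long> entry = history.floorEntry(timestampMillis);
        if (entry == null) {
            // Mirrors the documented behavior: no snapshot existed at that time.
            throw new IllegalArgumentException(
                "Cannot find a snapshot at or before " + timestampMillis);
        }
        return entry.getValue();
    }
}
```

With snapshots committed at 100, 200, and 300 ms, a lookup at 250 ms resolves to the snapshot committed at 200 ms, and a lookup at 50 ms fails with IllegalArgumentException.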
      • option

        TableScan option​(java.lang.String property,
                         java.lang.String value)
        Create a new TableScan from this scan's configuration that will override the Table's behavior based on the incoming pair. Unknown properties will be ignored.
        Parameters:
        property - name of the table property to be overridden
        value - value to override with
        Returns:
        a new scan based on this with overridden behavior
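Every refinement method on this interface returns a new scan rather than mutating the receiver. The sketch below illustrates that copy-on-refine pattern for option; it is a toy model, not Iceberg's code. ScanOptions and its targetSplitSize(long) accessor are hypothetical, and the property key read.split.target-size is used only for illustration. Unknown properties are "ignored" here in the sense that they are stored but never read.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy immutable holder for scan-level overrides; a hypothetical stand-in. */
final class ScanOptions {
    private final Map<String, String> overrides;

    ScanOptions() {
        this(Collections.emptyMap());
    }

    private ScanOptions(Map<String, String> overrides) {
        this.overrides = Collections.unmodifiableMap(overrides);
    }

    /** Returns a new instance with the pair added; this instance is left unchanged. */
    ScanOptions option(String property, String value) {
        Map<String, String> copy = new HashMap<>(overrides);
        copy.put(property, value);
        return new ScanOptions(copy);
    }

    /** Reads the override if present, otherwise falls back to the table-level default. */
    long targetSplitSize(long tableDefault) {
        String override = overrides.get("read.split.target-size");
        return override == null ? tableDefault : Long.parseLong(override);
    }
}
```

Because each call returns a fresh object, a base scan can be shared and refined in different ways without one caller's overrides leaking into another's.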
      • project

        TableScan project​(Schema schema)
        Create a new TableScan from this with the schema as its projection.
        Parameters:
        schema - a projection schema
        Returns:
        a new scan based on this with the given projection
      • caseSensitive

        TableScan caseSensitive​(boolean caseSensitive)
        Create a new TableScan from this that, if data columns were selected via select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity.
        Returns:
        a new scan based on this with case sensitivity as stated
      • includeColumnStats

        TableScan includeColumnStats()
        Create a new TableScan from this that loads the column stats with each data file.

        Column stats include: value count, null value count, lower bounds, and upper bounds.

        Returns:
        a new scan based on this that loads column stats.
      • select

        default TableScan select​(java.lang.String... columns)
        Create a new TableScan from this that will read the given data columns. This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.
        Parameters:
        columns - column names from the table's schema
        Returns:
        a new scan based on this with the given projection columns
      • select

        TableScan select​(java.util.Collection<java.lang.String> columns)
        Create a new TableScan from this that will read the given data columns. This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.
        Parameters:
        columns - column names from the table's schema
        Returns:
        a new scan based on this with the given projection columns
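The expected schema described for select is the set of table columns that are either selected or referenced by the filter, with name matching governed by caseSensitive. The following sketch models that rule over plain column names. ProjectionSketch and expectedColumns are hypothetical names, and the model does not cover nested fields or how the real scan treats names that match nothing.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Toy model of column selection; a hypothetical helper, not part of Iceberg. */
final class ProjectionSketch {
    private ProjectionSketch() {}

    /** Returns table columns, in table order, that are selected or used by the filter. */
    static List<String> expectedColumns(
            List<String> tableColumns,
            Collection<String> selected,
            Collection<String> filterColumns,
            boolean caseSensitive) {
        List<String> result = new ArrayList<>();
        for (String column : tableColumns) {
            if (matches(column, selected, caseSensitive)
                    || matches(column, filterColumns, caseSensitive)) {
                result.add(column);
            }
        }
        return result;
    }

    /** Checks whether any requested name refers to the given table column. */
    private static boolean matches(
            String column, Collection<String> names, boolean caseSensitive) {
        for (String name : names) {
            if (caseSensitive ? column.equals(name) : column.equalsIgnoreCase(name)) {
                return true;
            }
        }
        return false;
    }
}
```

For a table with columns id, data, and ts, selecting data while filtering on id yields id and data: the filter column is kept so that rows can still be filtered after projection.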
      • filter

        TableScan filter​(Expression expr)
        Create a new TableScan from this whose results are filtered by the given Expression.
        Parameters:
        expr - a filter expression
        Returns:
        a new scan based on this with results filtered by the expression
      • filter

        Expression filter()
        Returns this scan's filter Expression.
        Returns:
        this scan's filter expression
      • ignoreResiduals

        TableScan ignoreResiduals()
        Create a new TableScan from this that applies data filtering to files but not to rows in those files.
        Returns:
        a new scan based on this that does not filter rows in files.
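The difference between file-level filtering and row-level (residual) filtering can be illustrated with a toy predicate x > threshold. This is a sketch under stated assumptions, not Iceberg's implementation: ResidualSketch and its methods are hypothetical, each int[] stands in for one data file's values of x, and the sketch applies the row filter itself, whereas a real scan only plans files and leaves residual evaluation to the reader.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of file pruning versus row filtering; hypothetical, not part of Iceberg. */
final class ResidualSketch {
    private ResidualSketch() {}

    /** Keeps only files whose maximum value could satisfy x > threshold. */
    static List<int[]> pruneFiles(List<int[]> files, int threshold) {
        List<int[]> kept = new ArrayList<>();
        for (int[] rows : files) {
            int max = Integer.MIN_VALUE;
            for (int x : rows) {
                max = Math.max(max, x);
            }
            if (max > threshold) {
                kept.add(rows);
            }
        }
        return kept;
    }

    /** Reads surviving files; the row-level check is skipped when ignoreResiduals is true. */
    static List<Integer> read(List<int[]> files, int threshold, boolean ignoreResiduals) {
        List<Integer> out = new ArrayList<>();
        for (int[] rows : pruneFiles(files, threshold)) {
            for (int x : rows) {
                if (ignoreResiduals || x > threshold) {
                    out.add(x);
                }
            }
        }
        return out;
    }
}
```

With files {1, 2, 3}, {5, 15}, and {20, 30} and a threshold of 10, the first file is pruned in both modes. With residuals applied the result is 15, 20, 30; with residuals ignored the non-matching row 5 is also returned, because its file was kept.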
      • appendsBetween

        TableScan appendsBetween​(long fromSnapshotId,
                                 long toSnapshotId)
        Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive.
        Parameters:
        fromSnapshotId - the last snapshot id read by the user, exclusive
        toSnapshotId - the last snapshot id to read appended data from, inclusive
        Returns:
        a table scan that reads data appended after fromSnapshotId (exclusive) up to toSnapshotId (inclusive)
      • appendsAfter

        TableScan appendsAfter​(long fromSnapshotId)
        Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive.
        Parameters:
        fromSnapshotId - the last snapshot id read by the user, exclusive
        Returns:
        a table scan that reads data appended after fromSnapshotId (exclusive) up to the current snapshot (inclusive)
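The exclusive/inclusive bounds of appendsBetween and appendsAfter are easy to get wrong by one. The sketch below spells them out over a linear snapshot history, as a toy model only: AppendRangeSketch and its methods are hypothetical, the lineage is assumed to be a simple oldest-to-newest list of snapshot IDs, and snapshot operation types are ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of incremental snapshot ranges; hypothetical, not part of Iceberg. */
final class AppendRangeSketch {
    private AppendRangeSketch() {}

    /** Returns the snapshot IDs after fromExclusive, up to and including toInclusive. */
    static List<Long> snapshotsBetween(List<Long> lineage, long fromExclusive, long toInclusive) {
        int from = lineage.indexOf(fromExclusive);
        int to = lineage.indexOf(toInclusive);
        if (from < 0 || to < 0 || from > to) {
            throw new IllegalArgumentException(
                "Invalid range: " + fromExclusive + " to " + toInclusive);
        }
        // subList is [from + 1, to + 1): skips the start, includes the end.
        return new ArrayList<>(lineage.subList(from + 1, to + 1));
    }

    /** Same as snapshotsBetween, with the newest snapshot as the inclusive end. */
    static List<Long> snapshotsAfter(List<Long> lineage, long fromExclusive) {
        return snapshotsBetween(lineage, fromExclusive, lineage.get(lineage.size() - 1));
    }
}
```

A consumer that has already processed snapshot 10 in the history 10, 20, 30, 40 would pass 10 as fromSnapshotId and receive 20 onward, so nothing is read twice.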
      • planFiles

        CloseableIterable<FileScanTask> planFiles()
        Plan the files that will be read by this scan.

        Each file has a residual expression that should be applied to filter the file's rows.

        This simple plan returns file scans for each file from position 0 to the file's length. For planning that will combine small files, split large files, and attempt to balance work, use planTasks() instead.

        Returns:
        an Iterable of file tasks that are required by this scan
      • planTasks

        CloseableIterable<CombinedScanTask> planTasks()
        Plan the tasks for this scan.

        Tasks created by this method may read partial input files, multiple input files, or both.

        Returns:
        an Iterable of tasks for this scan
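To make "partial input files, multiple input files, or both" concrete, the sketch below splits files at a target size and then packs the pieces into tasks. It is a simplified illustration, not Iceberg's planner: SplitPlanningSketch is hypothetical, each split is a long[] of {fileIndex, offset, length}, packing is a single greedy pass with no lookback, and weighting each split as at least openFileCost is an assumption about how that setting is used.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy split-and-combine planner; hypothetical, not part of Iceberg. */
final class SplitPlanningSketch {
    private SplitPlanningSketch() {}

    /** Returns tasks; each task is a list of {fileIndex, offset, length} splits. */
    static List<List<long[]>> planTasks(
            long[] fileLengths, long targetSplitSize, long openFileCost) {
        if (targetSplitSize <= 0) {
            throw new IllegalArgumentException("targetSplitSize must be positive");
        }
        List<List<long[]>> tasks = new ArrayList<>();
        List<long[]> current = new ArrayList<>();
        long currentWeight = 0;
        for (int file = 0; file < fileLengths.length; file++) {
            // Split large files into pieces of at most targetSplitSize bytes.
            for (long offset = 0; offset < fileLengths[file]; offset += targetSplitSize) {
                long length = Math.min(targetSplitSize, fileLengths[file] - offset);
                // Small splits are weighted as at least openFileCost.
                long weight = Math.max(length, openFileCost);
                if (!current.isEmpty() && currentWeight + weight > targetSplitSize) {
                    // The current task is full: close it and start a new one.
                    tasks.add(current);
                    current = new ArrayList<>();
                    currentWeight = 0;
                }
                current.add(new long[] {file, offset, length});
                currentWeight += weight;
            }
        }
        if (!current.isEmpty()) {
            tasks.add(current);
        }
        return tasks;
    }
}
```

For files of 250, 30, and 30 bytes with a target of 100 and an open cost of 10, this produces four tasks. The third task holds the 50-byte tail of the first file together with the whole second file, so it reads a partial file and a complete file in one task.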
      • schema

        Schema schema()
        Returns this scan's projection Schema.

        If the projection schema was set directly using project(Schema), returns that schema.

        If the projection schema was set by calling select(Collection), returns a projection schema that includes the selected data fields and any fields used in the filter expression.

        Returns:
        this scan's projection schema
      • snapshot

        Snapshot snapshot()
        Returns the Snapshot that will be used by this scan.

        If the snapshot was not configured using asOfTime(long) or useSnapshot(long), the current table snapshot will be used.

        Returns:
        the Snapshot this scan will use
      • isCaseSensitive

        boolean isCaseSensitive()
        Returns whether this scan matches column names case-sensitively, as set by caseSensitive(boolean).
        Returns:
        true if case sensitive, false otherwise.
      • targetSplitSize

        long targetSplitSize()
        Returns the target split size for this scan.
      • splitLookback

        int splitLookback()
        Returns the split lookback for this scan.
      • splitOpenFileCost

        long splitOpenFileCost()
        Returns the split open file cost for this scan.