public interface TableScan
TableScan objects are immutable and can be shared between threads. Refinement methods, like select(Collection) and filter(Expression), create new TableScan instances.
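
The immutability contract can be illustrated without the library. The following is a minimal sketch, not Iceberg's implementation: SketchScan is a hypothetical stand-in class whose select and filter methods follow the same refinement style, returning a new instance and leaving the receiver unchanged. The filter is simplified to a plain string; the real interface takes an Expression.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration only; this is not the Iceberg TableScan.
final class SketchScan {
  private final List<String> columns; // selected column names; empty means "all columns"
  private final String filter;        // simplified filter text; the real API uses Expression

  SketchScan() {
    this(Collections.emptyList(), "true");
  }

  private SketchScan(List<String> columns, String filter) {
    this.columns = List.copyOf(columns); // unmodifiable copy, so state cannot change later
    this.filter = filter;
  }

  // Refinement: returns a new instance and leaves this one untouched.
  SketchScan select(String... names) {
    return new SketchScan(List.of(names), filter);
  }

  // Refinement: returns a new instance carrying the new filter text.
  SketchScan filter(String newFilter) {
    return new SketchScan(columns, newFilter);
  }

  List<String> columns() {
    return columns;
  }

  String filterText() {
    return filter;
  }
}
```

With this sketch, `new SketchScan().select("id", "ts").filter("id > 5")` produces a refined object while the base object still reports no selected columns. Because no method mutates state, instances can be handed to other threads without synchronization.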
Modifier and Type | Method and Description |
---|---|
TableScan | appendsAfter(long fromSnapshotId): Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive. |
TableScan | appendsBetween(long fromSnapshotId, long toSnapshotId): Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive. |
TableScan | asOfTime(long timestampMillis): Create a new TableScan from this scan's configuration that will use the most recent snapshot as of the given time in milliseconds. |
TableScan | caseSensitive(boolean caseSensitive): Create a new TableScan from this that, if data columns were selected via select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity. |
Expression | filter(): Returns this scan's filter Expression. |
TableScan | filter(Expression expr): Create a new TableScan from the results of this filtered by the Expression. |
TableScan | ignoreResiduals(): Create a new TableScan from this that applies data filtering to files but not to rows in those files. |
TableScan | includeColumnStats(): Create a new TableScan from this that loads the column stats with each data file. |
boolean | isCaseSensitive(): Returns whether this scan should apply column name case sensitivity as per caseSensitive(boolean). |
TableScan | option(java.lang.String property, java.lang.String value): Create a new TableScan from this scan's configuration that will override the Table's behavior based on the incoming pair. |
CloseableIterable<FileScanTask> | planFiles(): Plan the files that will be read by this scan. |
CloseableIterable<CombinedScanTask> | planTasks(): Plan the tasks for this scan. |
TableScan | project(Schema schema): Create a new TableScan from this with the schema as its projection. |
Schema | schema(): Returns this scan's projection Schema. |
TableScan | select(java.util.Collection<java.lang.String> columns): Create a new TableScan from this that will read the given data columns. |
default TableScan | select(java.lang.String... columns): Create a new TableScan from this that will read the given data columns. |
Snapshot | snapshot(): Returns the Snapshot that will be used by this scan. |
Table | table(): Returns the Table from which this scan loads data. |
TableScan | useSnapshot(long snapshotId): Create a new TableScan from this scan's configuration that will use the given snapshot by ID. |
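
The two incremental methods in the summary, appendsBetween and appendsAfter, share one range convention: fromSnapshotId is exclusive and the upper bound is inclusive. Below is a minimal sketch of that convention only, assuming the snapshot history is available as a list of IDs ordered oldest to newest. The AppendRangeSketch class, its helper names, and its error handling are hypothetical. Since snapshot IDs are not assumed to be increasing, the sketch works by position in the history rather than by numeric comparison.

```java
import java.util.List;

// Hypothetical sketch of the exclusive/inclusive range convention; not Iceberg code.
final class AppendRangeSketch {
  // IDs after fromId (exclusive) up to toId (inclusive), by position in the history.
  static List<Long> idsBetween(List<Long> history, long fromId, long toId) {
    int from = history.indexOf(fromId);
    int to = history.indexOf(toId);
    if (from < 0 || to < 0 || from > to) {
      // Assumed behavior for the sketch: reject unknown or reversed bounds.
      throw new IllegalArgumentException("snapshot range not found in history");
    }
    return List.copyOf(history.subList(from + 1, to + 1));
  }

  // IDs after fromId (exclusive) up to the newest entry (inclusive).
  static List<Long> idsAfter(List<Long> history, long fromId) {
    if (history.isEmpty()) {
      throw new IllegalArgumentException("history is empty");
    }
    return idsBetween(history, fromId, history.get(history.size() - 1));
  }
}
```

For the history `[10, 42, 7, 99]`, `idsBetween(history, 42L, 99L)` yields `[7, 99]` and `idsAfter(history, 10L)` yields `[42, 7, 99]`. Passing the same ID for both bounds yields an empty list, which matches the idea that the caller has already read everything up to fromSnapshotId.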

TableScan useSnapshot(long snapshotId)
Create a new TableScan from this scan's configuration that will use the given snapshot by ID.
Parameters:
snapshotId - a snapshot ID
Throws:
java.lang.IllegalArgumentException - if the snapshot cannot be found

TableScan asOfTime(long timestampMillis)
Create a new TableScan from this scan's configuration that will use the most recent snapshot as of the given time in milliseconds.
Parameters:
timestampMillis - a timestamp in milliseconds
Throws:
java.lang.IllegalArgumentException - if the snapshot cannot be found

TableScan option(java.lang.String property, java.lang.String value)
Create a new TableScan from this scan's configuration that will override the Table's behavior based on the incoming pair. Unknown properties will be ignored.
Parameters:
property - name of the table property to be overridden
value - value to override with

TableScan project(Schema schema)
Create a new TableScan from this with the schema as its projection.
Parameters:
schema - a projection schema

TableScan caseSensitive(boolean caseSensitive)
Create a new TableScan from this that, if data columns were selected via select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity.

TableScan includeColumnStats()
Create a new TableScan from this that loads the column stats with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.

default TableScan select(java.lang.String... columns)
Create a new TableScan from this that will read the given data columns. This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.
Parameters:
columns - column names from the table's schema

TableScan select(java.util.Collection<java.lang.String> columns)
Create a new TableScan from this that will read the given data columns. This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.
Parameters:
columns - column names from the table's schema

TableScan filter(Expression expr)
Create a new TableScan from the results of this filtered by the Expression.
Parameters:
expr - a filter expression

Expression filter()
Returns this scan's filter Expression.

TableScan ignoreResiduals()
Create a new TableScan from this that applies data filtering to files but not to rows in those files.

TableScan appendsBetween(long fromSnapshotId, long toSnapshotId)
Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive.
Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
toSnapshotId - read append data up to this snapshot id
Returns:
a TableScan that reads appended data from fromSnapshotId exclusive and up to toSnapshotId inclusive

TableScan appendsAfter(long fromSnapshotId)
Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive.
Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
Returns:
a TableScan that reads appended data from fromSnapshotId exclusive and up to current snapshot inclusive

CloseableIterable<FileScanTask> planFiles()
Plan the files that will be read by this scan.
Each file has a residual expression that should be applied to filter the file's rows.
This simple plan returns file scans for each file from position 0 to the file's length. For planning that will combine small files, split large files, and attempt to balance work, use planTasks() instead.

CloseableIterable<CombinedScanTask> planTasks()
Plan the tasks for this scan.
Tasks created by this method may read partial input files, multiple input files, or both.
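
The difference between the two planning methods can be sketched with plain byte ranges. This is a hypothetical illustration, not the planning algorithm Iceberg uses: PlanSketch, Range, and the targetSize parameter are assumed names, and the packing is a simple greedy pass. wholeFiles mirrors the planFiles() description (one range per file from position 0 to its length), while combinedTasks splits large files and groups small pieces in the manner described for planTasks().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the planning algorithm Iceberg uses.
final class PlanSketch {
  // A byte range [start, start + length) of one file.
  static final class Range {
    final String file;
    final long start;
    final long length;

    Range(String file, long start, long length) {
      this.file = file;
      this.start = start;
      this.length = length;
    }
  }

  // Analogue of planFiles(): one range per file, from position 0 to the file's length.
  static List<Range> wholeFiles(Map<String, Long> fileLengths) {
    List<Range> ranges = new ArrayList<>();
    for (Map.Entry<String, Long> entry : fileLengths.entrySet()) {
      ranges.add(new Range(entry.getKey(), 0L, entry.getValue()));
    }
    return ranges;
  }

  // Analogue of planTasks(): split files larger than targetSize into pieces, then
  // pack consecutive pieces into tasks whose total size stays within targetSize.
  static List<List<Range>> combinedTasks(Map<String, Long> fileLengths, long targetSize) {
    if (targetSize <= 0) {
      throw new IllegalArgumentException("targetSize must be positive");
    }
    List<Range> pieces = new ArrayList<>();
    for (Range whole : wholeFiles(fileLengths)) {
      for (long pos = 0; pos < whole.length; pos += targetSize) {
        pieces.add(new Range(whole.file, pos, Math.min(targetSize, whole.length - pos)));
      }
    }
    List<List<Range>> tasks = new ArrayList<>();
    List<Range> current = new ArrayList<>();
    long currentSize = 0;
    for (Range piece : pieces) {
      if (!current.isEmpty() && currentSize + piece.length > targetSize) {
        tasks.add(current); // current task is full; start a new one
        current = new ArrayList<>();
        currentSize = 0;
      }
      current.add(piece);
      currentSize += piece.length;
    }
    if (!current.isEmpty()) {
      tasks.add(current);
    }
    return tasks;
  }
}
```

Given files `big` (250 bytes), `a` (30), and `b` (40) in that order and a target size of 100, wholeFiles returns three ranges, while combinedTasks returns four tasks. The third task holds the last 50 bytes of `big` together with all of `a`, so it reads both a partial file and multiple files.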

Schema schema()
Returns this scan's projection Schema.
If the projection schema was set directly using project(Schema), returns that schema.
If the projection schema was set by calling select(Collection), returns a projection schema that includes the selected data fields and any fields used in the filter expression.
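
The select-based rule above amounts to a union of two column sets. A minimal sketch under stated assumptions: columns are modeled as plain names, ProjectionSketch and projectedColumns are hypothetical, the result is kept in table order as a choice of the sketch, and the caseSensitive flag is applied to name matching in the spirit of caseSensitive(boolean).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch of "selected fields plus fields used by the filter".
final class ProjectionSketch {
  // Returns the table columns that are either selected or referenced by the filter.
  static List<String> projectedColumns(
      List<String> tableColumns,
      Collection<String> selected,
      Collection<String> filterColumns,
      boolean caseSensitive) {
    List<String> result = new ArrayList<>();
    for (String column : tableColumns) {
      if (matches(column, selected, caseSensitive)
          || matches(column, filterColumns, caseSensitive)) {
        result.add(column);
      }
    }
    return result;
  }

  // True if any requested name refers to the column under the given case rule.
  private static boolean matches(String column, Collection<String> names, boolean caseSensitive) {
    for (String name : names) {
      if (caseSensitive ? column.equals(name) : column.equalsIgnoreCase(name)) {
        return true;
      }
    }
    return false;
  }
}
```

For table columns `[id, ts, level, msg]`, selecting `MSG` with a filter on `level` gives `[level, msg]` when matching is case-insensitive and `[level]` when it is case-sensitive, since `MSG` no longer matches `msg`.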

Snapshot snapshot()
Returns the Snapshot that will be used by this scan.
If the snapshot was not configured using asOfTime(long) or useSnapshot(long), the current table snapshot will be used.
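
Snapshot resolution by time can be sketched with a sorted map. This is an illustration under assumptions, not the library's code: SnapshotResolutionSketch is hypothetical, the map from commit time in milliseconds to snapshot ID stands in for the table's history, and treating a commit exactly at the given timestamp as included is an assumption of the sketch.

```java
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical sketch of picking the snapshot a scan would use; not Iceberg code.
final class SnapshotResolutionSketch {
  // Most recent snapshot committed at or before timestampMillis (inclusive by assumption).
  static long asOfTime(NavigableMap<Long, Long> idByCommitMillis, long timestampMillis) {
    Map.Entry<Long, Long> entry = idByCommitMillis.floorEntry(timestampMillis);
    if (entry == null) {
      throw new IllegalArgumentException("no snapshot at or before " + timestampMillis);
    }
    return entry.getValue();
  }

  // Default when neither asOfTime nor useSnapshot was configured: the newest snapshot.
  static long current(NavigableMap<Long, Long> idByCommitMillis) {
    if (idByCommitMillis.isEmpty()) {
      throw new IllegalStateException("table has no snapshots");
    }
    return idByCommitMillis.lastEntry().getValue();
  }
}
```

With commits at 1000, 2000, and 3000 ms mapped to IDs 11, 22, and 33, `asOfTime(log, 2500L)` resolves to 22, `current(log)` resolves to 33, and a timestamp before the first commit raises IllegalArgumentException, in line with the Throws clause documented for asOfTime.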

boolean isCaseSensitive()
Returns whether this scan should apply column name case sensitivity as per caseSensitive(boolean).