Interface TableScan
-
- All Known Implementing Classes:
AllDataFilesTable.AllDataFilesTableScan, AllManifestsTable.AllManifestsTableScan, DataFilesTable.FilesTableScan, DataTableScan
public interface TableScan
API for configuring a table scan.
TableScan objects are immutable and can be shared between threads. Refinement methods, like select(Collection) and filter(Expression), create new TableScan instances.
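The immutability contract described above can be illustrated with a minimal standalone sketch. `SketchScan` and its fields are hypothetical stand-ins, not Iceberg classes; the only point is that each refinement method returns a new object and leaves the receiver unchanged.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an immutable scan; not part of Iceberg.
final class SketchScan {
  private final List<String> columns; // selected column names
  private final String filter;        // filter kept as plain text for brevity

  SketchScan(List<String> columns, String filter) {
    // Defensive copy so later changes to the caller's list cannot leak in.
    this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
    this.filter = filter;
  }

  // Refinement methods never mutate this object; they build a new instance.
  SketchScan select(Collection<String> newColumns) {
    return new SketchScan(new ArrayList<>(newColumns), filter);
  }

  SketchScan filter(String newFilter) {
    return new SketchScan(columns, newFilter);
  }

  List<String> columns() {
    return columns;
  }

  String filter() {
    return filter;
  }

  public static void main(String[] args) {
    SketchScan base = new SketchScan(List.of("id", "data"), "true");
    SketchScan refined = base.select(List.of("id")).filter("id > 5");
    // The base scan keeps its original configuration.
    System.out.println(base.columns() + " / " + base.filter());
    System.out.println(refined.columns() + " / " + refined.filter());
  }
}
```

Because no state is ever modified after construction, a base scan of this shape can be handed to several threads, each of which derives its own refined copy.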
-
-
Method Summary
Modifier and Type | Method | Description
TableScan | appendsAfter(long fromSnapshotId) | Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive.
TableScan | appendsBetween(long fromSnapshotId, long toSnapshotId) | Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive.
TableScan | asOfTime(long timestampMillis) | Create a new TableScan from this scan's configuration that will use the most recent snapshot as of the given time in milliseconds.
TableScan | caseSensitive(boolean caseSensitive) | Create a new TableScan from this that, if data columns were selected via select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity.
Expression | filter() | Returns this scan's filter Expression.
TableScan | filter(Expression expr) | Create a new TableScan from the results of this filtered by the Expression.
TableScan | ignoreResiduals() | Create a new TableScan from this that applies data filtering to files but not to rows in those files.
TableScan | includeColumnStats() | Create a new TableScan from this that loads the column stats with each data file.
boolean | isCaseSensitive() | Returns whether this scan should apply column name case sensitivity as per caseSensitive(boolean).
TableScan | option(java.lang.String property, java.lang.String value) |
CloseableIterable<FileScanTask> | planFiles() | Plan the files that will be read by this scan.
CloseableIterable<CombinedScanTask> | planTasks() | Plan the tasks for this scan.
TableScan | project(Schema schema) | Create a new TableScan from this with the schema as its projection.
Schema | schema() | Returns this scan's projection Schema.
default TableScan | select(java.lang.String... columns) | Create a new TableScan from this that will read the given data columns.
TableScan | select(java.util.Collection<java.lang.String> columns) | Create a new TableScan from this that will read the given data columns.
Snapshot | snapshot() | Returns the Snapshot that will be used by this scan.
Table | table() | Returns the Table from which this scan loads data.
TableScan | useSnapshot(long snapshotId) | Create a new TableScan from this scan's configuration that will use the given snapshot by ID.
-
-
-
Method Detail
-
useSnapshot
TableScan useSnapshot(long snapshotId)
Create a new TableScan from this scan's configuration that will use the given snapshot by ID.
- Parameters:
snapshotId - a snapshot ID
- Returns:
- a new scan based on this with the given snapshot ID
- Throws:
java.lang.IllegalArgumentException - if the snapshot cannot be found
-
asOfTime
TableScan asOfTime(long timestampMillis)
Create a new TableScan from this scan's configuration that will use the most recent snapshot as of the given time in milliseconds.
- Parameters:
timestampMillis - a timestamp in milliseconds.
- Returns:
- a new scan based on this with the current snapshot at the given time
- Throws:
java.lang.IllegalArgumentException - if the snapshot cannot be found
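As a rough illustration of "most recent snapshot as of the given time", the lookup can be sketched with a sorted map. `SnapshotLogSketch` is a hypothetical class, and the sketch assumes that "as of" means committed at or before the timestamp; the library maintains its own snapshot history.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical snapshot log; not an Iceberg class.
final class SnapshotLogSketch {
  // Commit timestamp in milliseconds -> snapshot ID, sorted by timestamp.
  private final TreeMap<Long, Long> byTimestamp = new TreeMap<>();

  void commit(long timestampMillis, long snapshotId) {
    byTimestamp.put(timestampMillis, snapshotId);
  }

  // Returns the ID of the latest snapshot committed at or before the given time.
  long snapshotIdAsOf(long timestampMillis) {
    Map.Entry<Long, Long> entry = byTimestamp.floorEntry(timestampMillis);
    if (entry == null) {
      // Mirrors the documented behavior: no snapshot found means an exception.
      throw new IllegalArgumentException(
          "Cannot find a snapshot at or before " + timestampMillis);
    }
    return entry.getValue();
  }

  public static void main(String[] args) {
    SnapshotLogSketch log = new SnapshotLogSketch();
    log.commit(1000L, 1L);
    log.commit(2000L, 2L);
    log.commit(3000L, 3L);
    System.out.println(log.snapshotIdAsOf(2500L)); // snapshot committed at 2000
    System.out.println(log.snapshotIdAsOf(3000L)); // snapshot committed at 3000
  }
}
```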
-
option
TableScan option(java.lang.String property, java.lang.String value)
Create a new TableScan from this scan's configuration that will override the Table's behavior based on the incoming pair. Unknown properties will be ignored.
- Parameters:
property - name of the table property to be overridden
value - value to override with
- Returns:
- a new scan based on this with overridden behavior
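One way to picture per-scan overrides is a lookup that checks the scan's own options first and falls back to the table's properties. The sketch below is hypothetical throughout: `ScanOptionsSketch`, the property name, and the default value are assumptions chosen for illustration, not a description of how the library stores options.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of per-scan property overrides; not an Iceberg class.
final class ScanOptionsSketch {
  // Property name and default are assumptions made for this sketch.
  static final String SPLIT_SIZE = "read.split.target-size";
  static final long SPLIT_SIZE_DEFAULT = 128L * 1024 * 1024;

  private final Map<String, String> tableProperties;
  private final Map<String, String> overrides;

  ScanOptionsSketch(Map<String, String> tableProperties, Map<String, String> overrides) {
    this.tableProperties = Map.copyOf(tableProperties);
    this.overrides = Map.copyOf(overrides);
  }

  // Returns a new instance carrying the extra override; this one is unchanged.
  ScanOptionsSketch option(String property, String value) {
    Map<String, String> updated = new HashMap<>(overrides);
    updated.put(property, value);
    return new ScanOptionsSketch(tableProperties, updated);
  }

  // Only known keys are ever read, so unknown overrides have no effect.
  long targetSplitSize() {
    String value = overrides.get(SPLIT_SIZE);
    if (value == null) {
      value = tableProperties.get(SPLIT_SIZE);
    }
    return value == null ? SPLIT_SIZE_DEFAULT : Long.parseLong(value);
  }

  public static void main(String[] args) {
    ScanOptionsSketch base = new ScanOptionsSketch(Map.of(SPLIT_SIZE, "1000"), Map.of());
    System.out.println(base.option(SPLIT_SIZE, "64").targetSplitSize());
    System.out.println(base.option("no.such.property", "x").targetSplitSize());
  }
}
```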
-
project
TableScan project(Schema schema)
Create a new TableScan from this with the schema as its projection.
- Parameters:
schema - a projection schema
- Returns:
- a new scan based on this with the given projection
-
caseSensitive
TableScan caseSensitive(boolean caseSensitive)
Create a new TableScan from this that, if data columns were selected via select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity.
- Returns:
- a new scan based on this with case sensitivity as stated
-
includeColumnStats
TableScan includeColumnStats()
Create a new TableScan from this that loads the column stats with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.
- Returns:
- a new scan based on this that loads column stats.
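To make the four statistics concrete, the following sketch computes them for a single integer column held in memory. `ColumnStatsSketch` is a hypothetical class; it assumes the value count includes null entries, and it says nothing about how the library stores or encodes these values.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical per-column statistics for one data file; not an Iceberg class.
final class ColumnStatsSketch {
  final long valueCount;     // every entry, nulls included (assumption of this sketch)
  final long nullValueCount; // entries that are null
  final Integer lowerBound;  // smallest non-null value, or null if there is none
  final Integer upperBound;  // largest non-null value, or null if there is none

  ColumnStatsSketch(long valueCount, long nullValueCount, Integer lowerBound, Integer upperBound) {
    this.valueCount = valueCount;
    this.nullValueCount = nullValueCount;
    this.lowerBound = lowerBound;
    this.upperBound = upperBound;
  }

  static ColumnStatsSketch of(List<Integer> values) {
    long nulls = 0;
    Integer lower = null;
    Integer upper = null;
    for (Integer value : values) {
      if (value == null) {
        nulls++;
        continue;
      }
      if (lower == null || value < lower) {
        lower = value;
      }
      if (upper == null || value > upper) {
        upper = value;
      }
    }
    return new ColumnStatsSketch(values.size(), nulls, lower, upper);
  }

  public static void main(String[] args) {
    ColumnStatsSketch stats = of(Arrays.asList(3, null, 7, 1));
    System.out.println(stats.valueCount + " values, " + stats.nullValueCount
        + " null, bounds [" + stats.lowerBound + ", " + stats.upperBound + "]");
  }
}
```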
-
select
default TableScan select(java.lang.String... columns)
Create a new TableScan from this that will read the given data columns. This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.
- Parameters:
columns - column names from the table's schema
- Returns:
- a new scan based on this with the given projection columns
-
select
TableScan select(java.util.Collection<java.lang.String> columns)
Create a new TableScan from this that will read the given data columns. This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.
- Parameters:
columns - column names from the table's schema
- Returns:
- a new scan based on this with the given projection columns
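The rule that the expected schema contains every field that is either selected or referenced by the filter, together with the case sensitivity switch set by caseSensitive(boolean), can be sketched as a simple set union. `ProjectionSketch` is hypothetical and works on plain column names; in this sketch, names that match nothing are silently skipped, which may differ from the library's behavior.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical projection helper working on column names; not an Iceberg class.
final class ProjectionSketch {
  // Returns the table columns, in table order, that are selected or used by the filter.
  static List<String> project(
      List<String> tableColumns,
      Set<String> selected,
      Set<String> filterColumns,
      boolean caseSensitive) {
    Set<String> wanted = new HashSet<>();
    for (String name : selected) {
      wanted.add(normalize(name, caseSensitive));
    }
    for (String name : filterColumns) {
      wanted.add(normalize(name, caseSensitive));
    }
    List<String> result = new ArrayList<>();
    for (String column : tableColumns) {
      if (wanted.contains(normalize(column, caseSensitive))) {
        result.add(column);
      }
    }
    return result;
  }

  // Case-insensitive matching is modeled by lower-casing both sides.
  private static String normalize(String name, boolean caseSensitive) {
    return caseSensitive ? name : name.toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    List<String> table = List.of("id", "data", "ts");
    // "DATA" is selected, "id" is only used by the filter.
    System.out.println(project(table, Set.of("DATA"), Set.of("id"), false));
    System.out.println(project(table, Set.of("DATA"), Set.of("id"), true));
  }
}
```

With case-insensitive matching the result contains both `id` and `data`; with case-sensitive matching `DATA` matches no column, so only the filter column `id` remains.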
-
filter
TableScan filter(Expression expr)
Create a new TableScan from the results of this filtered by the Expression.
- Parameters:
expr - a filter expression
- Returns:
- a new scan based on this with results filtered by the expression
-
filter
Expression filter()
Returns this scan's filter Expression.
- Returns:
- this scan's filter expression
-
ignoreResiduals
TableScan ignoreResiduals()
Create a new TableScan from this that applies data filtering to files but not to rows in those files.
- Returns:
- a new scan based on this that does not filter rows in files.
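The difference between filtering files and filtering rows can be shown with a small model. In this hypothetical sketch, `ResidualSketch` evaluates the predicate "value > threshold": files whose upper bound rules out any match are skipped, and the row-level check plays the role of the residual. The real scan hands residual expressions to the reader rather than filtering rows itself, so this is only an approximation of the effect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of file pruning versus row filtering; not an Iceberg class.
final class ResidualSketch {
  // A data file holding one int column, with its upper bound precomputed.
  static final class DataFile {
    final int[] rows;
    final int upperBound;

    DataFile(int... rows) {
      this.rows = rows;
      int max = Integer.MIN_VALUE;
      for (int row : rows) {
        max = Math.max(max, row);
      }
      this.upperBound = max;
    }
  }

  // Evaluates "value > threshold" over all files.
  static List<Integer> scan(List<DataFile> files, int threshold, boolean ignoreResiduals) {
    List<Integer> out = new ArrayList<>();
    for (DataFile file : files) {
      if (file.upperBound <= threshold) {
        continue; // file-level filtering: no row in this file can match
      }
      for (int row : file.rows) {
        // Row-level filtering is skipped when residuals are ignored.
        if (ignoreResiduals || row > threshold) {
          out.add(row);
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<DataFile> files =
        List.of(new DataFile(1, 2, 3), new DataFile(4, 8), new DataFile(9, 10));
    System.out.println(scan(files, 5, false));
    System.out.println(scan(files, 5, true));
  }
}
```

In both runs the first file is pruned entirely. When residuals are ignored, the value 4 from the second file is still returned, because that file as a whole may contain matches.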
-
appendsBetween
TableScan appendsBetween(long fromSnapshotId, long toSnapshotId)
Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive.
- Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
toSnapshotId - read append data up to this snapshot id
- Returns:
- a table scan which can read append data from fromSnapshotId exclusive and up to toSnapshotId inclusive
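The exclusive/inclusive bounds can be sketched over a simple linear history. Everything here is hypothetical: `IncrementalSketch` assumes snapshots form a single ordered list, that each snapshot records an operation name, and that only snapshots whose operation is "append" contribute data to the result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of an incremental append range; not an Iceberg class.
final class IncrementalSketch {
  static final class Snap {
    final long id;
    final String operation;

    Snap(long id, String operation) {
      this.id = id;
      this.operation = operation;
    }
  }

  // Returns IDs of append snapshots after fromId (exclusive) up to toId (inclusive).
  static List<Long> appendsBetween(List<Snap> history, long fromId, long toId) {
    int from = -1;
    int to = -1;
    for (int i = 0; i < history.size(); i++) {
      if (history.get(i).id == fromId) {
        from = i;
      }
      if (history.get(i).id == toId) {
        to = i;
      }
    }
    if (from < 0 || to < 0 || from > to) {
      throw new IllegalArgumentException("Invalid snapshot range: " + fromId + " to " + toId);
    }
    List<Long> result = new ArrayList<>();
    for (int i = from + 1; i <= to; i++) {
      if ("append".equals(history.get(i).operation)) {
        result.add(history.get(i).id);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Snap> history = List.of(
        new Snap(10, "append"), new Snap(11, "append"),
        new Snap(12, "overwrite"), new Snap(13, "append"));
    // Snapshot 10 is excluded, 13 is included, and 12 is not an append.
    System.out.println(appendsBetween(history, 10, 13));
  }
}
```

Under the same assumptions, appendsAfter(fromSnapshotId) corresponds to calling this with the ID of the current snapshot as the upper bound.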
-
appendsAfter
TableScan appendsAfter(long fromSnapshotId)
Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive.
- Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
- Returns:
- a table scan which can read append data from fromSnapshotId exclusive and up to current snapshot inclusive
-
planFiles
CloseableIterable<FileScanTask> planFiles()
Plan the files that will be read by this scan.
Each file has a residual expression that should be applied to filter the file's rows.
This simple plan returns file scans for each file from position 0 to the file's length. For planning that will combine small files, split large files, and attempt to balance work, use planTasks() instead.
- Returns:
- an Iterable of file tasks that are required by this scan
-
planTasks
CloseableIterable<CombinedScanTask> planTasks()
Plan the tasks for this scan.
Tasks created by this method may read partial input files, multiple input files, or both.
- Returns:
- an Iterable of tasks for this scan
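A simplified picture of how tasks can end up reading partial files, multiple files, or both: large files are first cut into splits of at most a target size, and consecutive splits are then packed into tasks up to that size. `PlanSketch`, its target size parameter, and the in-order packing strategy are assumptions of this sketch; the library's planner uses its own settings and heuristics.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical split-and-combine planner; not an Iceberg class.
final class PlanSketch {
  // A byte range of one file.
  static final class Split {
    final String file;
    final long start;
    final long length;

    Split(String file, long start, long length) {
      this.file = file;
      this.start = start;
      this.length = length;
    }

    @Override
    public String toString() {
      return file + "[" + start + "+" + length + "]";
    }
  }

  // Cuts one file into ranges of at most targetSize bytes.
  static List<Split> splitFile(String file, long fileLength, long targetSize) {
    List<Split> splits = new ArrayList<>();
    for (long start = 0; start < fileLength; start += targetSize) {
      splits.add(new Split(file, start, Math.min(targetSize, fileLength - start)));
    }
    return splits;
  }

  // Packs splits, in order, into tasks whose total size stays within targetSize.
  static List<List<Split>> combine(List<Split> splits, long targetSize) {
    List<List<Split>> tasks = new ArrayList<>();
    List<Split> current = new ArrayList<>();
    long currentSize = 0;
    for (Split split : splits) {
      if (!current.isEmpty() && currentSize + split.length > targetSize) {
        tasks.add(current); // the current task is full, start a new one
        current = new ArrayList<>();
        currentSize = 0;
      }
      current.add(split);
      currentSize += split.length;
    }
    if (!current.isEmpty()) {
      tasks.add(current);
    }
    return tasks;
  }

  public static void main(String[] args) {
    List<Split> splits = new ArrayList<>();
    splits.addAll(splitFile("a", 250, 100)); // large file, split into three ranges
    splits.addAll(splitFile("b", 30, 100));  // small file
    splits.addAll(splitFile("c", 40, 100));  // small file
    for (List<Split> task : combine(splits, 100)) {
      System.out.println(task);
    }
  }
}
```

In this run the tail of file `a` and all of file `b` share one task, which is the "both" case: a partial file combined with another file.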
-
schema
Schema schema()
Returns this scan's projection Schema.
If the projection schema was set directly using project(Schema), returns that schema.
If the projection schema was set by calling select(Collection), returns a projection schema that includes the selected data fields and any fields used in the filter expression.
- Returns:
- this scan's projection schema
-
snapshot
Snapshot snapshot()
Returns the Snapshot that will be used by this scan.
If the snapshot was not configured using asOfTime(long) or useSnapshot(long), the current table snapshot will be used.
- Returns:
- the Snapshot this scan will use
-
isCaseSensitive
boolean isCaseSensitive()
Returns whether this scan should apply column name case sensitivity as per caseSensitive(boolean).
- Returns:
- true if case sensitive, false otherwise.
-
-