public class DataTableScan
extends java.lang.Object
| Modifier | Constructor and Description |
|---|---|
| | DataTableScan(TableOperations ops, Table table) |
| protected | DataTableScan(TableOperations ops, Table table, Schema schema, org.apache.iceberg.TableScanContext context) |
| Modifier and Type | Method and Description |
|---|---|
| TableScan | appendsAfter(long fromSnapshotId) - Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive. |
| TableScan | appendsBetween(long fromSnapshotId, long toSnapshotId) - Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive. |
| TableScan | asOfTime(long timestampMillis) - Create a new TableScan from this scan's configuration that will use the most recent snapshot as of the given time in milliseconds. |
| TableScan | caseSensitive(boolean scanCaseSensitive) - Create a new TableScan from this that, if data columns were selected via TableScan.select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity. |
| protected boolean | colStats() |
| protected org.apache.iceberg.TableScanContext | context() |
| Expression | filter() - Returns this scan's filter Expression. |
| TableScan | filter(Expression expr) - Create a new TableScan from the results of this filtered by the Expression. |
| TableScan | ignoreResiduals() - Create a new TableScan from this that applies data filtering to files but not to rows in those files. |
| TableScan | includeColumnStats() - Create a new TableScan from this that loads the column stats with each data file. |
| boolean | isCaseSensitive() - Returns whether this scan should apply column name case sensitivity as per TableScan.caseSensitive(boolean). |
| protected TableScan | newRefinedScan(TableOperations ops, Table table, Schema schema, org.apache.iceberg.TableScanContext context) |
| TableScan | option(java.lang.String property, java.lang.String value) |
| protected java.util.Map<java.lang.String,java.lang.String> | options() |
| CloseableIterable<FileScanTask> | planFiles() - Plan the files that will be read by this scan. |
| CloseableIterable<FileScanTask> | planFiles(TableOperations ops, Snapshot snapshot, Expression rowFilter, boolean ignoreResiduals, boolean caseSensitive, boolean colStats) |
| CloseableIterable<CombinedScanTask> | planTasks() - Plan the tasks for this scan. |
| TableScan | project(Schema projectedSchema) - Create a new TableScan from this with the schema as its projection. |
| Schema | schema() - Returns this scan's projection Schema. |
| TableScan | select(java.util.Collection<java.lang.String> columns) - Create a new TableScan from this that will read the given data columns. |
| protected java.util.Collection<java.lang.String> | selectedColumns() |
| protected boolean | shouldIgnoreResiduals() |
| Snapshot | snapshot() - Returns the Snapshot that will be used by this scan. |
| protected java.lang.Long | snapshotId() |
| int | splitLookback() - Returns the split lookback for this scan. |
| long | splitOpenFileCost() - Returns the split open file cost for this scan. |
| Table | table() - Returns the Table from which this scan loads data. |
| protected TableOperations | tableOps() |
| long | targetSplitSize() - Returns the target split size for this scan. |
| java.lang.String | toString() |
| TableScan | useSnapshot(long scanSnapshotId) - Create a new TableScan from this scan's configuration that will use the given snapshot by ID. |
public DataTableScan(TableOperations ops, Table table)
protected DataTableScan(TableOperations ops, Table table, Schema schema, org.apache.iceberg.TableScanContext context)
public TableScan appendsBetween(long fromSnapshotId, long toSnapshotId)
Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive.
Specified by: appendsBetween in interface TableScan
Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
toSnapshotId - read append data up to this snapshot id
Returns: a TableScan for appended data from fromSnapshotId exclusive and up to toSnapshotId inclusive
public TableScan appendsAfter(long fromSnapshotId)
Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive.
Specified by: appendsAfter in interface TableScan
Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
Returns: a TableScan for appended data from fromSnapshotId exclusive and up to the current snapshot inclusive
public TableScan useSnapshot(long scanSnapshotId)
Create a new TableScan from this scan's configuration that will use the given snapshot by ID.
Specified by: useSnapshot in interface TableScan
Parameters:
scanSnapshotId - a snapshot ID
protected TableScan newRefinedScan(TableOperations ops, Table table, Schema schema, org.apache.iceberg.TableScanContext context)
public CloseableIterable<FileScanTask> planFiles(TableOperations ops, Snapshot snapshot, Expression rowFilter, boolean ignoreResiduals, boolean caseSensitive, boolean colStats)
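The range covered by appendsBetween and appendsAfter is exclusive at fromSnapshotId and inclusive at the end. The following is a minimal sketch of that boundary rule only. It assumes a linear snapshot history held in a plain list, oldest first. The class AppendRangeSketch and its methods are hypothetical and are not part of Iceberg.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only (hypothetical class, not Iceberg code):
// the snapshot history is a plain list of IDs ordered oldest to newest.
final class AppendRangeSketch {
    private AppendRangeSketch() {}

    /** Returns the snapshot IDs in the range (fromSnapshotId, toSnapshotId]. */
    static List<Long> appendsBetween(List<Long> history, long fromSnapshotId, long toSnapshotId) {
        int from = history.indexOf(fromSnapshotId);
        int to = history.indexOf(toSnapshotId);
        if (from < 0 || to < 0 || from > to) {
            throw new IllegalArgumentException("both IDs must be in the history, from not after to");
        }
        // subList is [fromIndex, toIndex), so shift both bounds by one
        // to exclude the start snapshot and include the end snapshot.
        return new ArrayList<>(history.subList(from + 1, to + 1));
    }

    /** Same range, with the newest snapshot in the history as the inclusive end. */
    static List<Long> appendsAfter(List<Long> history, long fromSnapshotId) {
        if (history.isEmpty()) {
            throw new IllegalArgumentException("history is empty");
        }
        return appendsBetween(history, fromSnapshotId, history.get(history.size() - 1));
    }

    public static void main(String[] args) {
        List<Long> history = List.of(10L, 20L, 30L, 40L);
        System.out.println(appendsBetween(history, 10L, 30L)); // [20, 30]
        System.out.println(appendsAfter(history, 20L));        // [30, 40]
    }
}
```

Passing the last snapshot a reader has already consumed as fromSnapshotId therefore yields only snapshots it has not yet seen.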
public long targetSplitSize()
Returns the target split size for this scan.
protected TableOperations tableOps()
protected java.lang.Long snapshotId()
protected boolean colStats()
protected boolean shouldIgnoreResiduals()
protected java.util.Collection<java.lang.String> selectedColumns()
protected java.util.Map<java.lang.String,java.lang.String> options()
protected org.apache.iceberg.TableScanContext context()
public Table table()
Returns the Table from which this scan loads data.
public TableScan asOfTime(long timestampMillis)
Create a new TableScan from this scan's configuration that will use the most recent snapshot as of the given time in milliseconds.
public TableScan option(java.lang.String property, java.lang.String value)
public TableScan project(Schema projectedSchema)
Create a new TableScan from this with the schema as its projection.
public TableScan caseSensitive(boolean scanCaseSensitive)
Create a new TableScan from this that, if data columns were selected via TableScan.select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity.
Specified by: caseSensitive in interface TableScan
public TableScan includeColumnStats()
Create a new TableScan from this that loads the column stats with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.
Specified by: includeColumnStats in interface TableScan
public TableScan select(java.util.Collection<java.lang.String> columns)
Create a new TableScan from this that will read the given data columns. This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.
public TableScan filter(Expression expr)
Create a new TableScan from the results of this filtered by the Expression.
public Expression filter()
Returns this scan's filter Expression.
public TableScan ignoreResiduals()
Create a new TableScan from this that applies data filtering to files but not to rows in those files.
Specified by: ignoreResiduals in interface TableScan
public CloseableIterable<FileScanTask> planFiles()
Plan the files that will be read by this scan.
Each file has a residual expression that should be applied to filter the file's rows.
This simple plan returns file scans for each file from position 0 to the file's length. For
planning that will combine small files, split large files, and attempt to balance work, use
TableScan.planTasks() instead.
public CloseableIterable<CombinedScanTask> planTasks()
Plan the tasks for this scan.
Tasks created by this method may read partial input files, multiple input files, or both.
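To make the contrast between planFiles() and planTasks() concrete, here is a deliberately simplified sketch of splitting large files and combining small ones. It uses a plain greedy grouping and ignores split lookback and open file cost, so it should not be read as Iceberg's planning algorithm. The class SplitPlanningSketch, its nested Split type, and its methods are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only (hypothetical class, not Iceberg code).
final class SplitPlanningSketch {
    private SplitPlanningSketch() {}

    /** A byte range [start, start + length) of one file. */
    static final class Split {
        final String file;
        final long start;
        final long length;

        Split(String file, long start, long length) {
            this.file = file;
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return file + "[" + start + "+" + length + "]";
        }
    }

    /** Cuts one file into ranges of at most targetSplitSize bytes. */
    static List<Split> splitFile(String file, long fileLength, long targetSplitSize) {
        if (targetSplitSize <= 0) {
            throw new IllegalArgumentException("targetSplitSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += targetSplitSize) {
            splits.add(new Split(file, start, Math.min(targetSplitSize, fileLength - start)));
        }
        return splits;
    }

    /** Groups splits in order; a group is closed once adding the next split would exceed the target. */
    static List<List<Split>> combine(List<Split> splits, long targetSplitSize) {
        List<List<Split>> tasks = new ArrayList<>();
        List<Split> current = new ArrayList<>();
        long currentSize = 0;
        for (Split split : splits) {
            if (!current.isEmpty() && currentSize + split.length > targetSplitSize) {
                tasks.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(split);
            currentSize += split.length;
        }
        if (!current.isEmpty()) {
            tasks.add(current);
        }
        return tasks;
    }

    public static void main(String[] args) {
        long target = 100;
        List<Split> splits = new ArrayList<>();
        splits.addAll(splitFile("big.parquet", 250, target));     // ranges of 100, 100 and 50 bytes
        splits.addAll(splitFile("small-1.parquet", 30, target));
        splits.addAll(splitFile("small-2.parquet", 20, target));
        // 3 tasks; the last holds the 50-byte tail of big.parquet plus both small files
        for (List<Split> task : combine(splits, target)) {
            System.out.println(task);
        }
    }
}
```

In this model the last task reads part of one file and all of two others, which matches the "partial input files, multiple input files, or both" wording above.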
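public int splitLookback()
Returns the split lookback for this scan.
Specified by: splitLookback in interface TableScan
public long splitOpenFileCost()
Returns the split open file cost for this scan.
Specified by: splitOpenFileCost in interface TableScan
public Schema schema()
Returns this scan's projection Schema.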
If the projection schema was set directly using TableScan.project(Schema), returns that schema.
If the projection schema was set by calling TableScan.select(Collection), returns a projection
schema that includes the selected data fields and any fields used in the filter expression.
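The select-based rule above, selected fields plus any fields used by the filter, can be sketched with plain column names. This is an illustration under simplifying assumptions: columns are flat strings rather than Schema fields, and names that match nothing are silently skipped, which a real scan may not do. ProjectionSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative model only (hypothetical class, not Iceberg code):
// columns are plain names rather than Iceberg Schema fields.
final class ProjectionSketch {
    private ProjectionSketch() {}

    /**
     * Returns the table columns that are either selected or referenced by the filter,
     * in table order. Names that match no table column are skipped in this sketch.
     */
    static List<String> projectedColumns(
            List<String> tableColumns,
            Collection<String> selected,
            Collection<String> filterColumns,
            boolean caseSensitive) {
        Set<String> wanted = new HashSet<>();
        for (String name : selected) {
            wanted.add(normalize(name, caseSensitive));
        }
        for (String name : filterColumns) {
            wanted.add(normalize(name, caseSensitive));
        }

        List<String> projected = new ArrayList<>();
        for (String column : tableColumns) {
            if (wanted.contains(normalize(column, caseSensitive))) {
                projected.add(column);
            }
        }
        return projected;
    }

    // Case-insensitive matching compares lower-cased names.
    private static String normalize(String name, boolean caseSensitive) {
        return caseSensitive ? name : name.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        List<String> table = List.of("id", "data", "ts", "category");
        System.out.println(projectedColumns(table, List.of("data"), List.of("ts"), true));  // [data, ts]
        System.out.println(projectedColumns(table, List.of("DATA"), List.of("ts"), false)); // [data, ts]
    }
}
```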
public Snapshot snapshot()
Returns the Snapshot that will be used by this scan.
If the snapshot was not configured using TableScan.asOfTime(long) or TableScan.useSnapshot(long), the current table
snapshot will be used.
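As an illustration of what "the most recent snapshot as of the given time" means for asOfTime(long), the sketch below picks the snapshot with the greatest timestamp at or before the requested time. Treating "as of" as "at or before" is an assumption of this sketch. AsOfTimeSketch and snapshotAsOf are hypothetical names, and the timestamp-to-ID map stands in for the table's snapshot history.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative model only (hypothetical class, not Iceberg code):
// the snapshot history is a sorted map from commit timestamp (ms) to snapshot ID.
final class AsOfTimeSketch {
    private AsOfTimeSketch() {}

    /** Returns the ID of the latest snapshot committed at or before timestampMillis. */
    static long snapshotAsOf(NavigableMap<Long, Long> snapshotIdsByTimestamp, long timestampMillis) {
        // floorEntry returns the entry with the greatest key <= the given key, or null if none exists.
        Map.Entry<Long, Long> entry = snapshotIdsByTimestamp.floorEntry(timestampMillis);
        if (entry == null) {
            throw new IllegalArgumentException("no snapshot at or before " + timestampMillis);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        NavigableMap<Long, Long> snapshots = new TreeMap<>();
        snapshots.put(1_000L, 11L);
        snapshots.put(2_000L, 22L);
        snapshots.put(3_000L, 33L);
        System.out.println(snapshotAsOf(snapshots, 2_500L)); // 22
        System.out.println(snapshotAsOf(snapshots, 3_000L)); // 33
    }
}
```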
public boolean isCaseSensitive()
Returns whether this scan should apply column name case sensitivity as per TableScan.caseSensitive(boolean).
Specified by: isCaseSensitive in interface TableScan
public java.lang.String toString()
Overrides: toString in class java.lang.Object