Class DataTableScan
- java.lang.Object
  - org.apache.iceberg.DataTableScan
- All Implemented Interfaces: TableScan
public class DataTableScan extends java.lang.Object
-
-
Constructor Summary
Constructors (Modifier, Constructor):
- public DataTableScan(TableOperations ops, Table table)
- protected DataTableScan(TableOperations ops, Table table, Schema schema, org.apache.iceberg.TableScanContext context)
-
Method Summary
Methods (Modifier and Type, Method, Description):
- TableScan appendsAfter(long fromSnapshotId): Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive.
- TableScan appendsBetween(long fromSnapshotId, long toSnapshotId): Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive.
- TableScan asOfTime(long timestampMillis): Create a new TableScan from this scan's configuration that will use the most recent snapshot as of the given time in milliseconds.
- TableScan caseSensitive(boolean scanCaseSensitive): Create a new TableScan from this that, if data columns were selected via TableScan.select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity.
- protected boolean colStats()
- protected org.apache.iceberg.TableScanContext context()
- Expression filter(): Returns this scan's filter Expression.
- TableScan filter(Expression expr): Create a new TableScan from the results of this filtered by the Expression.
- TableScan ignoreResiduals(): Create a new TableScan from this that applies data filtering to files but not to rows in those files.
- TableScan includeColumnStats(): Create a new TableScan from this that loads the column stats with each data file.
- boolean isCaseSensitive(): Returns whether this scan should apply column name case sensitiveness as per TableScan.caseSensitive(boolean).
- protected TableScan newRefinedScan(TableOperations ops, Table table, Schema schema, org.apache.iceberg.TableScanContext context)
- TableScan option(java.lang.String property, java.lang.String value)
- protected java.util.Map<java.lang.String,java.lang.String> options()
- CloseableIterable<FileScanTask> planFiles(): Plan the files that will be read by this scan.
- CloseableIterable<FileScanTask> planFiles(TableOperations ops, Snapshot snapshot, Expression rowFilter, boolean ignoreResiduals, boolean caseSensitive, boolean colStats)
- CloseableIterable<CombinedScanTask> planTasks(): Plan the tasks for this scan.
- TableScan project(Schema projectedSchema): Create a new TableScan from this with the schema as its projection.
- Schema schema(): Returns this scan's projection Schema.
- TableScan select(java.util.Collection<java.lang.String> columns): Create a new TableScan from this that will read the given data columns.
- protected java.util.Collection<java.lang.String> selectedColumns()
- protected boolean shouldIgnoreResiduals()
- Snapshot snapshot(): Returns the Snapshot that will be used by this scan.
- protected java.lang.Long snapshotId()
- int splitLookback(): Returns the split lookback for this scan.
- long splitOpenFileCost(): Returns the split open file cost for this scan.
- Table table(): Returns the Table from which this scan loads data.
- protected TableOperations tableOps()
- protected Schema tableSchema()
- long targetSplitSize(): Returns the target split size for this scan.
- java.lang.String toString()
- TableScan useSnapshot(long scanSnapshotId): Create a new TableScan from this scan's configuration that will use the given snapshot by ID.
-
-
-
Constructor Detail
-
DataTableScan
public DataTableScan(TableOperations ops, Table table)
-
DataTableScan
protected DataTableScan(TableOperations ops, Table table, Schema schema, org.apache.iceberg.TableScanContext context)
-
-
Method Detail
-
appendsBetween
public TableScan appendsBetween(long fromSnapshotId, long toSnapshotId)
Description copied from interface: TableScan
Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive.
- Specified by:
appendsBetween in interface TableScan
- Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
toSnapshotId - read append data up to this snapshot id
- Returns:
- a table scan which can read append data from fromSnapshotId exclusive and up to toSnapshotId inclusive
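The exclusive lower bound and inclusive upper bound are easy to misread. The following is a minimal sketch of that range rule, not Iceberg code: it assumes a linear table history given as a list of snapshot IDs in commit order, and `SnapshotRange.between` is a hypothetical helper invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: SnapshotRange is a hypothetical helper, not part of Iceberg.
final class SnapshotRange {
    private SnapshotRange() {}

    // Returns the snapshot IDs committed after fromId (exclusive) up to and
    // including toId, given the history in commit order, oldest first.
    static List<Long> between(List<Long> history, long fromId, long toId) {
        int from = history.indexOf(fromId);
        int to = history.indexOf(toId);
        if (from < 0 || to < 0 || from > to) {
            throw new IllegalArgumentException("invalid snapshot range");
        }
        return new ArrayList<>(history.subList(from + 1, to + 1));
    }
}
```

With a history of 10, 20, 30, 40, the call `between(history, 10L, 30L)` yields 20 and 30: the snapshot the reader already consumed is skipped and the upper bound is kept.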
-
appendsAfter
public TableScan appendsAfter(long fromSnapshotId)
Description copied from interface: TableScan
Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive.
- Specified by:
appendsAfter in interface TableScan
- Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
- Returns:
- a table scan which can read append data from fromSnapshotId exclusive and up to current snapshot inclusive
-
useSnapshot
public TableScan useSnapshot(long scanSnapshotId)
Description copied from interface: TableScan
Create a new TableScan from this scan's configuration that will use the given snapshot by ID.
- Specified by:
useSnapshot in interface TableScan
- Parameters:
scanSnapshotId - a snapshot ID
- Returns:
- a new scan based on this with the given snapshot ID
-
newRefinedScan
protected TableScan newRefinedScan(TableOperations ops, Table table, Schema schema, org.apache.iceberg.TableScanContext context)
-
planFiles
public CloseableIterable<FileScanTask> planFiles(TableOperations ops, Snapshot snapshot, Expression rowFilter, boolean ignoreResiduals, boolean caseSensitive, boolean colStats)
-
targetSplitSize
public long targetSplitSize()
Description copied from interface: TableScan
Returns the target split size for this scan.
-
tableSchema
protected Schema tableSchema()
-
tableOps
protected TableOperations tableOps()
-
snapshotId
protected java.lang.Long snapshotId()
-
colStats
protected boolean colStats()
-
shouldIgnoreResiduals
protected boolean shouldIgnoreResiduals()
-
selectedColumns
protected java.util.Collection<java.lang.String> selectedColumns()
-
options
protected java.util.Map<java.lang.String,java.lang.String> options()
-
context
protected org.apache.iceberg.TableScanContext context()
-
table
public Table table()
Description copied from interface: TableScan
Returns the Table from which this scan loads data.
-
asOfTime
public TableScan asOfTime(long timestampMillis)
Description copied from interface: TableScan
Create a new TableScan from this scan's configuration that will use the most recent snapshot as of the given time in milliseconds.
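As a rough illustration of "most recent snapshot as of the given time", the sketch below looks up a snapshot by timestamp in a sorted map. It is a toy model, not the Iceberg implementation: `SnapshotLog` and its methods are hypothetical, and treating a snapshot committed exactly at the given millisecond as eligible is an assumption of the sketch.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: SnapshotLog is a hypothetical stand-in, not an Iceberg class.
final class SnapshotLog {
    // Commit timestamp in milliseconds -> snapshot ID.
    private final TreeMap<Long, Long> byTimestamp = new TreeMap<>();

    void record(long timestampMillis, long snapshotId) {
        byTimestamp.put(timestampMillis, snapshotId);
    }

    // Returns the ID of the latest snapshot committed at or before the given
    // time. The inclusive comparison is an assumption of this sketch.
    long asOfTime(long timestampMillis) {
        Map.Entry<Long, Long> entry = byTimestamp.floorEntry(timestampMillis);
        if (entry == null) {
            throw new IllegalArgumentException(
                "no snapshot at or before " + timestampMillis);
        }
        return entry.getValue();
    }
}
```

If snapshots 1 and 2 were recorded at 1000 ms and 2000 ms, `asOfTime(1500L)` selects snapshot 1, and a time earlier than 1000 ms has no valid snapshot.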
-
option
public TableScan option(java.lang.String property, java.lang.String value)
Description copied from interface: TableScan
-
project
public TableScan project(Schema projectedSchema)
Description copied from interface: TableScan
Create a new TableScan from this with the schema as its projection.
-
caseSensitive
public TableScan caseSensitive(boolean scanCaseSensitive)
Description copied from interface: TableScan
Create a new TableScan from this that, if data columns were selected via TableScan.select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity.
- Specified by:
caseSensitive in interface TableScan
- Returns:
- a new scan based on this with case sensitivity as stated
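To make the effect of the flag concrete, here is a minimal sketch of matching selected column names against schema column names with and without case sensitivity. `ColumnMatcher` is a hypothetical helper, not an Iceberg class, and silently dropping names that match nothing is a simplification of this sketch rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: ColumnMatcher is a hypothetical helper, not part of Iceberg.
final class ColumnMatcher {
    private ColumnMatcher() {}

    // Resolves each selected name to the schema column it matches.
    // Names with no match are skipped (a simplification of this sketch).
    static List<String> resolve(
            List<String> schemaColumns, List<String> selected, boolean caseSensitive) {
        List<String> resolved = new ArrayList<>();
        for (String name : selected) {
            for (String column : schemaColumns) {
                boolean matches = caseSensitive
                        ? column.equals(name)
                        : column.equalsIgnoreCase(name);
                if (matches) {
                    resolved.add(column);
                    break;
                }
            }
        }
        return resolved;
    }
}
```

For a schema with columns `id` and `Data`, selecting `ID` and `data` resolves to nothing when matching is case sensitive and to both columns when it is not.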
-
includeColumnStats
public TableScan includeColumnStats()
Description copied from interface: TableScan
Create a new TableScan from this that loads the column stats with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.
- Specified by:
includeColumnStats in interface TableScan
- Returns:
- a new scan based on this that loads column stats.
-
select
public TableScan select(java.util.Collection<java.lang.String> columns)
Description copied from interface: TableScan
Create a new TableScan from this that will read the given data columns. This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.
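The expected schema is therefore the union of the selected fields and the fields the filter refers to. A minimal sketch of that rule follows; `ProjectionSketch` is hypothetical, columns are modeled as plain strings, and keeping the result in table-schema order is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Toy model only: ProjectionSketch is hypothetical, not part of Iceberg.
final class ProjectionSketch {
    private ProjectionSketch() {}

    // Keeps every schema column that was selected or is referenced by the
    // filter. Preserving schema order is an assumption of this sketch.
    static List<String> expectedColumns(
            List<String> schemaColumns,
            Collection<String> selected,
            Collection<String> filterColumns) {
        List<String> expected = new ArrayList<>();
        for (String column : schemaColumns) {
            if (selected.contains(column) || filterColumns.contains(column)) {
                expected.add(column);
            }
        }
        return expected;
    }
}
```

Selecting only `data` while filtering on `ts` produces an expected schema containing both `ts` and `data`, because the filter column must be read to evaluate the filter.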
-
filter
public TableScan filter(Expression expr)
Description copied from interface: TableScan
Create a new TableScan from the results of this filtered by the Expression.
-
filter
public Expression filter()
Description copied from interface: TableScan
Returns this scan's filter Expression.
-
ignoreResiduals
public TableScan ignoreResiduals()
Description copied from interface: TableScan
Create a new TableScan from this that applies data filtering to files but not to rows in those files.
- Specified by:
ignoreResiduals in interface TableScan
- Returns:
- a new scan based on this that does not filter rows in files.
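The difference between file-level and row-level filtering can be sketched with a toy model. Everything below is hypothetical and heavily simplified: files are arrays of integers, the filter is `value >= threshold`, and file pruning uses each file's maximum value. In a real scan the residual is attached to planned tasks and applied by the reader, which this sketch folds into a single method.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: ResidualSketch is hypothetical, not part of Iceberg.
final class ResidualSketch {
    private ResidualSketch() {}

    // Returns the values read for the filter "value >= threshold".
    // Files whose maximum is below the threshold are always pruned.
    // Rows inside surviving files are filtered only when residuals are kept.
    static List<Integer> scan(List<int[]> files, int threshold, boolean ignoreResiduals) {
        List<Integer> rows = new ArrayList<>();
        for (int[] file : files) {
            int max = Integer.MIN_VALUE;
            for (int value : file) {
                max = Math.max(max, value);
            }
            if (max < threshold) {
                continue; // file-level filtering: no row in this file can match
            }
            for (int value : file) {
                if (ignoreResiduals || value >= threshold) {
                    rows.add(value);
                }
            }
        }
        return rows;
    }
}
```

With files {1, 2, 3} and {4, 9} and a threshold of 5, the first file is pruned either way. Keeping residuals returns only 9, while ignoring them returns 4 and 9, so the caller must be prepared to see non-matching rows.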
-
planFiles
public CloseableIterable<FileScanTask> planFiles()
Description copied from interface: TableScan
Plan the files that will be read by this scan.
Each file has a residual expression that should be applied to filter the file's rows.
This simple plan returns file scans for each file from position 0 to the file's length. For planning that will combine small files, split large files, and attempt to balance work, use TableScan.planTasks() instead.
-
planTasks
public CloseableIterable<CombinedScanTask> planTasks()
Description copied from interface: TableScan
Plan the tasks for this scan.
Tasks created by this method may read partial input files, multiple input files, or both.
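A task that reads a partial input file covers a byte range of that file. The sketch below shows one simple way such ranges could be derived from a file length and a target split size. It is a hypothetical illustration, not the Iceberg planner: `SplitSketch` is invented here, and combining small files into one task is not modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: SplitSketch is hypothetical, not part of Iceberg.
final class SplitSketch {
    private SplitSketch() {}

    // Cuts a file into consecutive {offset, length} ranges, each at most
    // targetSplitSize bytes long. The last range holds the remainder.
    static List<long[]> split(long fileLength, long targetSplitSize) {
        if (targetSplitSize <= 0) {
            throw new IllegalArgumentException("targetSplitSize must be positive");
        }
        List<long[]> splits = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += targetSplitSize) {
            long length = Math.min(targetSplitSize, fileLength - offset);
            splits.add(new long[] {offset, length});
        }
        return splits;
    }
}
```

A 250-byte file with a target split size of 100 is cut into three ranges: offsets 0 and 100 with length 100, and offset 200 with length 50.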
-
splitLookback
public int splitLookback()
Description copied from interface: TableScan
Returns the split lookback for this scan.
- Specified by:
splitLookback in interface TableScan
-
splitOpenFileCost
public long splitOpenFileCost()
Description copied from interface: TableScan
Returns the split open file cost for this scan.
- Specified by:
splitOpenFileCost in interface TableScan
-
schema
public Schema schema()
Description copied from interface: TableScan
Returns this scan's projection Schema.
If the projection schema was set directly using TableScan.project(Schema), returns that schema.
If the projection schema was set by calling TableScan.select(Collection), returns a projection schema that includes the selected data fields and any fields used in the filter expression.
-
snapshot
public Snapshot snapshot()
Description copied from interface: TableScan
Returns the Snapshot that will be used by this scan.
If the snapshot was not configured using TableScan.asOfTime(long) or TableScan.useSnapshot(long), the current table snapshot will be used.
-
isCaseSensitive
public boolean isCaseSensitive()
Description copied from interface: TableScan
Returns whether this scan should apply column name case sensitiveness as per TableScan.caseSensitive(boolean).
- Specified by:
isCaseSensitive in interface TableScan
- Returns:
- true if case sensitive, false otherwise.
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
-