public static class DataFilesTable.DataFilesTableScan extends SnapshotScan<TableScan,FileScanTask,CombinedScanTask>
| Modifier and Type | Field and Description |
|---|---|
| protected static java.util.List<java.lang.String> | DELETE_SCAN_COLUMNS |
| protected static java.util.List<java.lang.String> | DELETE_SCAN_WITH_STATS_COLUMNS |
| protected static boolean | PLAN_SCANS_WITH_WORKER_POOL |
| protected static java.util.List<java.lang.String> | SCAN_COLUMNS |
| protected static java.util.List<java.lang.String> | SCAN_WITH_STATS_COLUMNS |
| Modifier and Type | Method and Description |
|---|---|
| TableScan | appendsAfter(long fromSnapshotId): Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive. |
| TableScan | appendsBetween(long fromSnapshotId, long toSnapshotId): Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive. |
| ThisT | caseSensitive(boolean caseSensitive): Create a new scan from this that, if data columns were selected via Scan.select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity. |
| protected java.util.Set<java.lang.Integer> | columnsToKeepStats() |
| protected org.apache.iceberg.TableScanContext | context() |
| protected CloseableIterable<FileScanTask> | doPlanFiles() |
| Expression | filter(): Returns this scan's filter Expression. |
| ThisT | filter(Expression expr): Create a new scan from the results of this filtered by the Expression. |
| ThisT | ignoreResiduals(): Create a new scan from this that applies data filtering to files but not to rows in those files. |
| ThisT | includeColumnStats(): Create a new scan from this that loads the column stats with each data file. |
| ThisT | includeColumnStats(java.util.Collection<java.lang.String> requestedColumns): Create a new scan from this that loads the column stats for the specific columns with each data file. |
| protected FileIO | io() |
| boolean | isCaseSensitive(): Returns whether this scan is case-sensitive with respect to column names. |
| protected CloseableIterable<ManifestFile> | manifests(): Returns an iterable of manifest files to explore for this files metadata table scan. |
| ThisT | metricsReporter(MetricsReporter reporter): Create a new scan that will report scan metrics to the provided reporter in addition to reporters maintained by the scan. |
| protected TableScan | newRefinedScan(Table table, Schema schema, org.apache.iceberg.TableScanContext context) |
| ThisT | option(java.lang.String property, java.lang.String value): Create a new scan from this scan's configuration that will override the Table's behavior based on the incoming pair. |
| protected java.util.Map<java.lang.String,java.lang.String> | options() |
| protected java.util.concurrent.ExecutorService | planExecutor() |
| CloseableIterable<CombinedScanTask> | planTasks(): Plan balanced task groups for this scan by splitting large and combining small tasks. |
| ThisT | planWith(java.util.concurrent.ExecutorService executorService): Create a new scan to use a particular executor to plan. |
| ThisT | project(Schema projectedSchema): Create a new scan from this with the schema as its projection. |
| protected Expression | residualFilter() |
| protected java.util.List<java.lang.String> | scanColumns() |
| Schema | schema(): Returns this scan's projection Schema. |
| ThisT | select(java.util.Collection<java.lang.String> columns): Create a new scan from this that will read the given data columns. |
| protected boolean | shouldIgnoreResiduals() |
| protected boolean | shouldPlanWithExecutor() |
| protected boolean | shouldReturnColumnStats() |
| int | splitLookback(): Returns the split lookback for this scan. |
| long | splitOpenFileCost(): Returns the split open file cost for this scan. |
| Table | table() |
| protected Schema | tableSchema() |
| protected MetadataTableType | tableType(): Type of scan being performed, such as MetadataTableType.ALL_DATA_FILES when scanning a table's AllDataFilesTable. |
| long | targetSplitSize(): Returns the target split size for this scan. |
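
Most public methods in the summary above are described as "Create a new scan from this", which suggests that each refinement returns a fresh scan and leaves the receiver unchanged. The sketch below illustrates that pattern in plain Java. `RefinementSketch` and `SketchScan` are hypothetical names invented for this example and are not Iceberg classes; the column names are illustrative only. The default of `true` for case sensitivity follows the documented default.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: these are not Iceberg classes.
public class RefinementSketch {

  /** Immutable scan-like object; every refinement copies state into a new instance. */
  public static final class SketchScan {
    private final List<String> columns;
    private final boolean caseSensitive;

    public SketchScan() {
      this(Collections.emptyList(), true); // case-sensitive unless told otherwise
    }

    private SketchScan(List<String> columns, boolean caseSensitive) {
      this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
      this.caseSensitive = caseSensitive;
    }

    /** Returns a new scan with the given columns; this instance is not modified. */
    public SketchScan select(List<String> selected) {
      return new SketchScan(selected, caseSensitive);
    }

    /** Returns a new scan with the given case sensitivity; this instance is not modified. */
    public SketchScan caseSensitive(boolean value) {
      return new SketchScan(columns, value);
    }

    public List<String> columns() {
      return columns;
    }

    public boolean isCaseSensitive() {
      return caseSensitive;
    }
  }

  public static void main(String[] args) {
    SketchScan base = new SketchScan();
    SketchScan refined = base.select(List.of("file_path", "record_count")).caseSensitive(false);
    System.out.println(base.columns() + " " + base.isCaseSensitive());       // [] true
    System.out.println(refined.columns() + " " + refined.isCaseSensitive()); // [file_path, record_count] false
  }
}
```

Because each call returns a new object, a partially configured scan can be shared and refined in different ways without one caller affecting another.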
Methods inherited from class SnapshotScan:
asOfTime, planFiles, scanMetrics, snapshot, snapshotId, toString, useRef, useSnapshot, useSnapshotSchema

Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface TableScan:
asOfTime, snapshot, table, useRef, useSnapshot

Methods inherited from interface Scan:
caseSensitive, filter, filter, ignoreResiduals, includeColumnStats, includeColumnStats, isCaseSensitive, metricsReporter, option, planFiles, planWith, project, schema, select, select, splitLookback, splitOpenFileCost

protected static final java.util.List<java.lang.String> SCAN_COLUMNS
protected static final java.util.List<java.lang.String> SCAN_WITH_STATS_COLUMNS
protected static final java.util.List<java.lang.String> DELETE_SCAN_COLUMNS
protected static final java.util.List<java.lang.String> DELETE_SCAN_WITH_STATS_COLUMNS
protected static final boolean PLAN_SCANS_WITH_WORKER_POOL
protected TableScan newRefinedScan(Table table, Schema schema, org.apache.iceberg.TableScanContext context)
protected CloseableIterable<ManifestFile> manifests()
protected CloseableIterable<FileScanTask> doPlanFiles()
Specified by: doPlanFiles in class SnapshotScan<TableScan,FileScanTask,CombinedScanTask>

protected MetadataTableType tableType()
Type of scan being performed, such as MetadataTableType.ALL_DATA_FILES when scanning a table's AllDataFilesTable.
Used for logging and error messages.

public TableScan appendsBetween(long fromSnapshotId, long toSnapshotId)
Description copied from interface: TableScan
Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive.
Specified by: appendsBetween in interface TableScan
Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
toSnapshotId - read append data up to this snapshot id
Returns: a table scan which can read append data from fromSnapshotId exclusive and up to toSnapshotId inclusive

public TableScan appendsAfter(long fromSnapshotId)
Description copied from interface: TableScan
Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive.
Specified by: appendsAfter in interface TableScan
Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
Returns: a table scan which can read append data from fromSnapshotId exclusive and up to current snapshot inclusive

public long targetSplitSize()
Description copied from interface: Scan
Returns the target split size for this scan.
Specified by: targetSplitSize in interface Scan<TableScan,FileScanTask,CombinedScanTask>

public CloseableIterable<CombinedScanTask> planTasks()
Description copied from interface: Scan
Plan balanced task groups for this scan by splitting large and combining small tasks.
Task groups created by this method may read partial input files, multiple input files or both.
Specified by: planTasks in interface Scan<TableScan,FileScanTask,CombinedScanTask>

public Table table()
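
The range semantics of appendsBetween and appendsAfter above (start exclusive, end inclusive) can be illustrated with a small self-contained sketch. `AppendRangeSketch`, its methods, and the snapshot ids are hypothetical; the sketch assumes the snapshot lineage is available as a list ordered from oldest to newest, which is a simplification and not how the library resolves snapshots.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the (from, to] range semantics; not Iceberg code.
public class AppendRangeSketch {

  /** Returns the snapshot ids after fromSnapshotId (exclusive) up to toSnapshotId (inclusive). */
  public static List<Long> snapshotsBetween(List<Long> lineage, long fromSnapshotId, long toSnapshotId) {
    int from = lineage.indexOf(fromSnapshotId);
    int to = lineage.indexOf(toSnapshotId);
    if (from < 0 || to < 0 || from > to) {
      throw new IllegalArgumentException("Snapshots must exist in the lineage, with from not after to");
    }
    // Skip the starting snapshot itself, keep the ending snapshot.
    return new ArrayList<>(lineage.subList(from + 1, to + 1));
  }

  /** Same as snapshotsBetween, with the newest snapshot in the lineage as the inclusive end. */
  public static List<Long> snapshotsAfter(List<Long> lineage, long fromSnapshotId) {
    if (lineage.isEmpty()) {
      throw new IllegalArgumentException("Lineage is empty");
    }
    return snapshotsBetween(lineage, fromSnapshotId, lineage.get(lineage.size() - 1));
  }

  public static void main(String[] args) {
    List<Long> lineage = List.of(10L, 20L, 30L, 40L); // oldest to newest
    System.out.println(snapshotsBetween(lineage, 10L, 30L)); // [20, 30]
    System.out.println(snapshotsAfter(lineage, 20L));        // [30, 40]
  }
}
```

Passing the last snapshot id already read as fromSnapshotId therefore yields only snapshots that have not been consumed yet, which is why the parameter is documented as exclusive.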
protected FileIO io()
protected Schema tableSchema()
protected org.apache.iceberg.TableScanContext context()
protected java.util.Map<java.lang.String,java.lang.String> options()
protected java.util.List<java.lang.String> scanColumns()
protected boolean shouldReturnColumnStats()
protected java.util.Set<java.lang.Integer> columnsToKeepStats()
protected boolean shouldIgnoreResiduals()
protected Expression residualFilter()
protected boolean shouldPlanWithExecutor()
protected java.util.concurrent.ExecutorService planExecutor()
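
public ThisT option(java.lang.String property, java.lang.String value)
Description copied from interface: Scan
Create a new scan from this scan's configuration that will override the Table's behavior based on the incoming pair. Unknown properties will be ignored.
Specified by: option in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
Parameters:
property - name of the table property to be overridden
value - value to override with

public ThisT project(Schema projectedSchema)
Description copied from interface: Scan
Create a new scan from this with the schema as its projection.
Specified by: project in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
Parameters:
projectedSchema - a projection schema

public ThisT caseSensitive(boolean caseSensitive)
Description copied from interface: Scan
Create a new scan from this that, if data columns were selected via Scan.select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity. Default is true.
Specified by: caseSensitive in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>

public boolean isCaseSensitive()
Description copied from interface: Scan
Returns whether this scan is case-sensitive with respect to column names.
Specified by: isCaseSensitive in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>

public ThisT includeColumnStats()
Description copied from interface: Scan
Create a new scan from this that loads the column stats with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.
Specified by: includeColumnStats in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>

public ThisT includeColumnStats(java.util.Collection<java.lang.String> requestedColumns)
Description copied from interface: Scan
Create a new scan from this that loads the column stats for the specific columns with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.
Specified by: includeColumnStats in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
Parameters:
requestedColumns - column names for which to keep the stats.

public ThisT select(java.util.Collection<java.lang.String> columns)
Description copied from interface: Scan
Create a new scan from this that will read the given data columns.
Specified by: select in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
Parameters:
columns - column names from the table's schema

public ThisT filter(Expression expr)
Description copied from interface: Scan
Create a new scan from the results of this filtered by the Expression.
Specified by: filter in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
Parameters:
expr - a filter expression

public Expression filter()
Description copied from interface: Scan
Returns this scan's filter Expression.
Specified by: filter in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>

public ThisT ignoreResiduals()
Description copied from interface: Scan
Create a new scan from this that applies data filtering to files but not to rows in those files.
Specified by: ignoreResiduals in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>

public ThisT planWith(java.util.concurrent.ExecutorService executorService)
Description copied from interface: Scan
Create a new scan to use a particular executor to plan.
Specified by: planWith in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
Parameters:
executorService - the provided executor

public Schema schema()
Description copied from interface: Scan
Returns this scan's projection Schema.
If the projection schema was set directly using Scan.project(Schema), returns that schema.
If the projection schema was set by calling Scan.select(Collection), returns a projection schema that includes the selected data fields and any fields used in the filter expression.
Specified by: schema in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>

public int splitLookback()
Description copied from interface: Scan
Returns the split lookback for this scan.
Specified by: splitLookback in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>

public long splitOpenFileCost()
Description copied from interface: Scan
Returns the split open file cost for this scan.
Specified by: splitOpenFileCost in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>

public ThisT metricsReporter(MetricsReporter reporter)
Description copied from interface: Scan
Create a new scan that will report scan metrics to the provided reporter in addition to reporters maintained by the scan.
Specified by: metricsReporter in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>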
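
To make "splitting large and combining small tasks" from planTasks more concrete, here is a deliberately simplified sketch of how a target split size and an open file cost could interact. It is not the library's planner: `SplitPlanningSketch` and its methods are hypothetical, files are reduced to byte lengths, the open file cost is assumed to act as a minimum weight per piece, and pieces are packed greedily in order, so the split lookback is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified illustration of split planning; not Iceberg's planner.
public class SplitPlanningSketch {

  /** Splits one file length into pieces no larger than targetSplitSize. */
  public static List<Long> split(long fileLength, long targetSplitSize) {
    List<Long> pieces = new ArrayList<>();
    long remaining = fileLength;
    while (remaining > 0) {
      long piece = Math.min(remaining, targetSplitSize);
      pieces.add(piece);
      remaining -= piece;
    }
    return pieces;
  }

  /** Packs pieces, in order, into groups whose total weight stays within targetSplitSize. */
  public static List<List<Long>> plan(List<Long> fileLengths, long targetSplitSize, long openFileCost) {
    if (targetSplitSize <= 0) {
      throw new IllegalArgumentException("targetSplitSize must be positive");
    }
    List<List<Long>> groups = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long currentWeight = 0;
    for (long fileLength : fileLengths) {
      for (long piece : split(fileLength, targetSplitSize)) {
        // Assumption of this sketch: a tiny piece still costs at least openFileCost.
        long weight = Math.max(piece, openFileCost);
        if (!current.isEmpty() && currentWeight + weight > targetSplitSize) {
          groups.add(current);
          current = new ArrayList<>();
          currentWeight = 0;
        }
        current.add(piece);
        currentWeight += weight;
      }
    }
    if (!current.isEmpty()) {
      groups.add(current);
    }
    return groups;
  }

  public static void main(String[] args) {
    // One 250-byte file and three 10-byte files; target split size 100, open file cost 40.
    System.out.println(plan(List.of(250L, 10L, 10L, 10L), 100, 40));
    // prints [[100], [100], [50, 10], [10, 10]]
  }
}
```

In this run the first two groups each read part of one file, the third reads the tail of the large file together with a small file, and the last combines two small files, which mirrors the note that task groups may read partial input files, multiple input files or both.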