Package org.apache.iceberg
Class FilesTable.FilesTableScan
- java.lang.Object
-
- org.apache.iceberg.SnapshotScan<TableScan,FileScanTask,CombinedScanTask>
-
- org.apache.iceberg.FilesTable.FilesTableScan
-
- All Implemented Interfaces:
Scan<TableScan,FileScanTask,CombinedScanTask>, TableScan
- Enclosing class:
- FilesTable
public static class FilesTable.FilesTableScan extends SnapshotScan<TableScan,FileScanTask,CombinedScanTask>
-
-
Field Summary
- protected static java.util.List<java.lang.String> DELETE_SCAN_COLUMNS
- protected static java.util.List<java.lang.String> DELETE_SCAN_WITH_STATS_COLUMNS
- protected static boolean PLAN_SCANS_WITH_WORKER_POOL
- protected static java.util.List<java.lang.String> SCAN_COLUMNS
- protected static java.util.List<java.lang.String> SCAN_WITH_STATS_COLUMNS
-
Method Summary
- TableScan appendsAfter(long fromSnapshotId): Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive.
- TableScan appendsBetween(long fromSnapshotId, long toSnapshotId): Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive.
- ThisT caseSensitive(boolean caseSensitive): Create a new scan from this that, if data columns were selected via Scan.select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity.
- protected java.util.Set<java.lang.Integer> columnsToKeepStats()
- protected org.apache.iceberg.TableScanContext context()
- protected CloseableIterable<FileScanTask> doPlanFiles()
- Expression filter(): Returns this scan's filter Expression.
- ThisT filter(Expression expr): Create a new scan from the results of this filtered by the Expression.
- ThisT ignoreResiduals(): Create a new scan from this that applies data filtering to files but not to rows in those files.
- ThisT includeColumnStats(): Create a new scan from this that loads the column stats with each data file.
- ThisT includeColumnStats(java.util.Collection<java.lang.String> requestedColumns): Create a new scan from this that loads the column stats for the specific columns with each data file.
- protected FileIO io()
- boolean isCaseSensitive(): Returns whether this scan is case-sensitive with respect to column names.
- protected CloseableIterable<ManifestFile> manifests(): Returns an iterable of manifest files to explore for this files metadata table scan.
- ThisT metricsReporter(MetricsReporter reporter): Create a new scan that will report scan metrics to the provided reporter in addition to reporters maintained by the scan.
- protected TableScan newRefinedScan(Table table, Schema schema, org.apache.iceberg.TableScanContext context)
- ThisT option(java.lang.String property, java.lang.String value): Create a new scan from this scan's configuration that will override the Table's behavior based on the incoming pair.
- protected java.util.Map<java.lang.String,java.lang.String> options()
- protected java.util.concurrent.ExecutorService planExecutor()
- CloseableIterable<CombinedScanTask> planTasks(): Plan balanced task groups for this scan by splitting large and combining small tasks.
- ThisT planWith(java.util.concurrent.ExecutorService executorService): Create a new scan to use a particular executor to plan.
- ThisT project(Schema projectedSchema): Create a new scan from this with the schema as its projection.
- protected Expression residualFilter()
- protected java.util.List<java.lang.String> scanColumns()
- Schema schema(): Returns this scan's projection Schema.
- ThisT select(java.util.Collection<java.lang.String> columns): Create a new scan from this that will read the given data columns.
- protected boolean shouldIgnoreResiduals()
- protected boolean shouldPlanWithExecutor()
- protected boolean shouldReturnColumnStats()
- int splitLookback(): Returns the split lookback for this scan.
- long splitOpenFileCost(): Returns the split open file cost for this scan.
- Table table()
- protected Schema tableSchema()
- protected MetadataTableType tableType(): Type of scan being performed, such as MetadataTableType.ALL_DATA_FILES when scanning a table's AllDataFilesTable.
- long targetSplitSize(): Returns the target split size for this scan.
-
Methods inherited from class org.apache.iceberg.SnapshotScan
asOfTime, planFiles, scanMetrics, snapshot, snapshotId, toString, useRef, useSnapshot, useSnapshotSchema
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface org.apache.iceberg.Scan
caseSensitive, filter, filter, ignoreResiduals, includeColumnStats, includeColumnStats, isCaseSensitive, metricsReporter, option, planFiles, planWith, project, schema, select, select, splitLookback, splitOpenFileCost
-
-
-
-
Field Detail
-
SCAN_COLUMNS
protected static final java.util.List<java.lang.String> SCAN_COLUMNS
-
SCAN_WITH_STATS_COLUMNS
protected static final java.util.List<java.lang.String> SCAN_WITH_STATS_COLUMNS
-
DELETE_SCAN_COLUMNS
protected static final java.util.List<java.lang.String> DELETE_SCAN_COLUMNS
-
DELETE_SCAN_WITH_STATS_COLUMNS
protected static final java.util.List<java.lang.String> DELETE_SCAN_WITH_STATS_COLUMNS
-
PLAN_SCANS_WITH_WORKER_POOL
protected static final boolean PLAN_SCANS_WITH_WORKER_POOL
-
-
Method Detail
-
newRefinedScan
protected TableScan newRefinedScan(Table table, Schema schema, org.apache.iceberg.TableScanContext context)
-
manifests
protected CloseableIterable<ManifestFile> manifests()
Returns an iterable of manifest files to explore for this files metadata table scan.
-
doPlanFiles
protected CloseableIterable<FileScanTask> doPlanFiles()
- Specified by:
doPlanFiles in class SnapshotScan<TableScan,FileScanTask,CombinedScanTask>
-
tableType
protected MetadataTableType tableType()
Type of scan being performed, such as MetadataTableType.ALL_DATA_FILES when scanning a table's AllDataFilesTable. Used for logging and error messages.
-
appendsBetween
public TableScan appendsBetween(long fromSnapshotId, long toSnapshotId)
Description copied from interface: TableScan
Create a new TableScan to read appended data from fromSnapshotId exclusive to toSnapshotId inclusive.
- Specified by:
appendsBetween in interface TableScan
- Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
toSnapshotId - read append data up to this snapshot id
- Returns:
- a table scan which can read append data from fromSnapshotId exclusive and up to toSnapshotId inclusive
-
appendsAfter
public TableScan appendsAfter(long fromSnapshotId)
Description copied from interface: TableScan
Create a new TableScan to read appended data from fromSnapshotId exclusive to the current snapshot inclusive.
- Specified by:
appendsAfter in interface TableScan
- Parameters:
fromSnapshotId - the last snapshot id read by the user, exclusive
- Returns:
- a table scan which can read append data from fromSnapshotId exclusive and up to current snapshot inclusive
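The exclusive/inclusive bounds of appendsBetween and appendsAfter can be illustrated with a minimal sketch. It assumes a purely linear snapshot history held in a list ordered oldest to newest; the class SnapshotRangeSketch and its methods are hypothetical stand-ins and are not part of Iceberg.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not Iceberg code: a linear snapshot history listed
// oldest to newest. It only illustrates the (from, to] range semantics.
final class SnapshotRangeSketch {

  // Snapshot ids whose appends would be read: everything after
  // fromSnapshotId (exclusive) up to toSnapshotId (inclusive).
  static List<Long> snapshotsBetween(List<Long> history, long fromSnapshotId, long toSnapshotId) {
    int from = history.indexOf(fromSnapshotId);
    int to = history.indexOf(toSnapshotId);
    if (from < 0 || to < 0 || from > to) {
      throw new IllegalArgumentException("fromSnapshotId must precede toSnapshotId in the history");
    }
    return new ArrayList<>(history.subList(from + 1, to + 1));
  }

  // Same range, with the newest snapshot in the history as the upper bound.
  static List<Long> snapshotsAfter(List<Long> history, long fromSnapshotId) {
    long current = history.get(history.size() - 1);
    return snapshotsBetween(history, fromSnapshotId, current);
  }

  public static void main(String[] args) {
    List<Long> history = Arrays.asList(101L, 102L, 103L, 104L);
    System.out.println(snapshotsBetween(history, 101L, 103L)); // [102, 103]
    System.out.println(snapshotsAfter(history, 102L));         // [103, 104]
  }
}
```

Passing the last snapshot id already processed as fromSnapshotId therefore avoids reading that snapshot's appends twice.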
-
targetSplitSize
public long targetSplitSize()
Description copied from interface: Scan
Returns the target split size for this scan.
- Specified by:
targetSplitSize in interface Scan<TableScan,FileScanTask,CombinedScanTask>
-
planTasks
public CloseableIterable<CombinedScanTask> planTasks()
Description copied from interface: Scan
Plan balanced task groups for this scan by splitting large and combining small tasks.
Task groups created by this method may read partial input files, multiple input files or both.
- Specified by:
planTasks in interface Scan<TableScan,FileScanTask,CombinedScanTask>
- Returns:
- an Iterable of balanced task groups required by this scan
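The idea of "splitting large and combining small tasks" can be sketched as follows. This is a simplified model under stated assumptions, not Iceberg's planner: files are represented only by their byte lengths, packing is a single greedy pass over consecutive chunks (no lookback), and each chunk is weighted as at least openFileCost. The class SplitPlanningSketch and its plan method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of split planning; not Iceberg's implementation.
final class SplitPlanningSketch {

  // Splits each file length into chunks of at most targetSplitSize bytes, then
  // greedily packs consecutive chunks into groups. A chunk weighs at least
  // openFileCost, so many tiny files do not all land in one group.
  static List<List<Long>> plan(List<Long> fileLengths, long targetSplitSize, long openFileCost) {
    if (targetSplitSize <= 0) {
      throw new IllegalArgumentException("targetSplitSize must be positive");
    }

    // Step 1: split large files.
    List<Long> chunks = new ArrayList<>();
    for (long length : fileLengths) {
      long remaining = length;
      while (remaining > targetSplitSize) {
        chunks.add(targetSplitSize);
        remaining -= targetSplitSize;
      }
      if (remaining > 0) {
        chunks.add(remaining);
      }
    }

    // Step 2: combine small chunks until the target weight would be exceeded.
    List<List<Long>> groups = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long currentWeight = 0;
    for (long chunk : chunks) {
      long weight = Math.max(chunk, openFileCost);
      if (!current.isEmpty() && currentWeight + weight > targetSplitSize) {
        groups.add(current);
        current = new ArrayList<>();
        currentWeight = 0;
      }
      current.add(chunk);
      currentWeight += weight;
    }
    if (!current.isEmpty()) {
      groups.add(current);
    }
    return groups;
  }

  public static void main(String[] args) {
    // One 250-byte file and three small files, target 100, open cost 25.
    System.out.println(plan(Arrays.asList(250L, 10L, 20L, 30L), 100L, 25L));
    // [[100], [100], [50, 10, 20], [30]]
  }
}
```

In this model the group [50, 10, 20] reads part of one file plus two whole files, matching the note that a task group may read partial input files, multiple input files or both.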
-
table
public Table table()
-
io
protected FileIO io()
-
tableSchema
protected Schema tableSchema()
-
context
protected org.apache.iceberg.TableScanContext context()
-
options
protected java.util.Map<java.lang.String,java.lang.String> options()
-
scanColumns
protected java.util.List<java.lang.String> scanColumns()
-
shouldReturnColumnStats
protected boolean shouldReturnColumnStats()
-
columnsToKeepStats
protected java.util.Set<java.lang.Integer> columnsToKeepStats()
-
shouldIgnoreResiduals
protected boolean shouldIgnoreResiduals()
-
residualFilter
protected Expression residualFilter()
-
shouldPlanWithExecutor
protected boolean shouldPlanWithExecutor()
-
planExecutor
protected java.util.concurrent.ExecutorService planExecutor()
-
option
public ThisT option(java.lang.String property, java.lang.String value)
Description copied from interface: Scan
Create a new scan from this scan's configuration that will override the Table's behavior based on the incoming pair. Unknown properties will be ignored.
- Specified by:
option in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
property - name of the table property to be overridden
value - value to override with
- Returns:
- a new scan based on this with overridden behavior
-
project
public ThisT project(Schema projectedSchema)
Description copied from interface: Scan
Create a new scan from this with the schema as its projection.
- Specified by:
project in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
projectedSchema - a projection schema
- Returns:
- a new scan based on this with the given projection
-
caseSensitive
public ThisT caseSensitive(boolean caseSensitive)
Description copied from interface: Scan
Create a new scan from this that, if data columns were selected via Scan.select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity. Default is true.
- Specified by:
caseSensitive in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
- a new scan based on this with case sensitivity as stated
-
isCaseSensitive
public boolean isCaseSensitive()
Description copied from interface: Scan
Returns whether this scan is case-sensitive with respect to column names.
- Specified by:
isCaseSensitive in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
- true if case-sensitive, false otherwise.
-
includeColumnStats
public ThisT includeColumnStats()
Description copied from interface: Scan
Create a new scan from this that loads the column stats with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.
- Specified by:
includeColumnStats in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
- a new scan based on this that loads column stats.
-
includeColumnStats
public ThisT includeColumnStats(java.util.Collection<java.lang.String> requestedColumns)
Description copied from interface: Scan
Create a new scan from this that loads the column stats for the specific columns with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.
- Specified by:
includeColumnStats in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
requestedColumns - column names for which to keep the stats.
- Returns:
- a new scan based on this that loads column stats for specific columns.
-
select
public ThisT select(java.util.Collection<java.lang.String> columns)
Description copied from interface: Scan
Create a new scan from this that will read the given data columns. This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.
- Specified by:
select in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
columns - column names from the table's schema
- Returns:
- a new scan based on this with the given projection columns
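The rule that the expected schema contains every field that is either selected or used by the filter can be sketched with plain collections. This is an illustration only: schemas are modeled as lists of column names, the result is assumed to keep table-schema order, and the class SelectProjectionSketch, its method, and the column names used are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not Iceberg code: schemas are lists of column names.
final class SelectProjectionSketch {

  // A column is kept if it was selected or if the filter references it.
  // Iterating over tableColumns keeps the table-schema order (an assumption
  // of this sketch).
  static List<String> projectedColumns(
      List<String> tableColumns, Set<String> selected, Set<String> filterColumns) {
    List<String> projected = new ArrayList<>();
    for (String column : tableColumns) {
      if (selected.contains(column) || filterColumns.contains(column)) {
        projected.add(column);
      }
    }
    return projected;
  }

  public static void main(String[] args) {
    List<String> table = Arrays.asList("content", "file_path", "record_count", "file_size_in_bytes");
    Set<String> selected = new HashSet<>(Arrays.asList("file_path"));
    Set<String> usedByFilter = new HashSet<>(Arrays.asList("record_count"));
    System.out.println(projectedColumns(table, selected, usedByFilter));
    // [file_path, record_count]
  }
}
```

Here record_count appears in the result even though only file_path was selected, because the filter needs it.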
-
filter
public ThisT filter(Expression expr)
Description copied from interface: Scan
Create a new scan from the results of this filtered by the Expression.
- Specified by:
filter in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
expr - a filter expression
- Returns:
- a new scan based on this with results filtered by the expression
-
filter
public Expression filter()
Description copied from interface: Scan
Returns this scan's filter Expression.
- Specified by:
filter in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
- this scan's filter expression
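The refinement methods on this class are described as creating "a new scan from this". A minimal sketch of that style follows, assuming the receiver is left unchanged and that successive filters are combined with AND. The class RefinedScanSketch is hypothetical, and a String stands in for an Expression.

```java
// Hypothetical model, not Iceberg code: an immutable scan configuration where
// every refinement returns a new instance instead of mutating the receiver.
final class RefinedScanSketch {
  private final String filter;         // stand-in for an Expression
  private final boolean caseSensitive; // case-sensitive by default, per the docs above

  RefinedScanSketch() {
    this("true", true);
  }

  private RefinedScanSketch(String filter, boolean caseSensitive) {
    this.filter = filter;
    this.caseSensitive = caseSensitive;
  }

  // Narrows the results of this scan; the new filter is ANDed with the old one
  // (an assumption of this sketch).
  RefinedScanSketch filter(String expr) {
    return new RefinedScanSketch("(" + filter + ") AND (" + expr + ")", caseSensitive);
  }

  RefinedScanSketch caseSensitive(boolean value) {
    return new RefinedScanSketch(filter, value);
  }

  String filter() {
    return filter;
  }

  boolean isCaseSensitive() {
    return caseSensitive;
  }

  public static void main(String[] args) {
    RefinedScanSketch base = new RefinedScanSketch();
    RefinedScanSketch refined = base.filter("record_count > 0").caseSensitive(false);
    System.out.println(base.filter());    // true
    System.out.println(refined.filter()); // (true) AND (record_count > 0)
  }
}
```

Because each call returns a fresh object, a base scan can be shared and refined in different ways without one caller's changes leaking into another's.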
-
ignoreResiduals
public ThisT ignoreResiduals()
Description copied from interface: Scan
Create a new scan from this that applies data filtering to files but not to rows in those files.
- Specified by:
ignoreResiduals in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
- a new scan based on this that does not filter rows in files.
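The difference between filtering files and filtering rows can be shown with a small model. It assumes each file is just an array of integer values and that the scan filter is "value > threshold"; the class ResidualSketch and its methods are hypothetical and do not reflect how Iceberg readers are implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not Iceberg code: each "file" is an int[] of row values
// and the scan filter is "value > threshold".
final class ResidualSketch {

  // File-level filtering: a file can be skipped when its upper bound rules
  // out any match.
  static boolean mayContainMatch(int[] file, int threshold) {
    int max = Integer.MIN_VALUE;
    for (int value : file) {
      max = Math.max(max, value);
    }
    return max > threshold;
  }

  // Rows a reader would emit. With ignoreResiduals set, rows of the surviving
  // files come back unfiltered; otherwise the filter is also applied per row.
  static List<Integer> read(List<int[]> files, int threshold, boolean ignoreResiduals) {
    List<Integer> rows = new ArrayList<>();
    for (int[] file : files) {
      if (!mayContainMatch(file, threshold)) {
        continue; // pruned using file-level information only
      }
      for (int value : file) {
        if (ignoreResiduals || value > threshold) {
          rows.add(value);
        }
      }
    }
    return rows;
  }

  public static void main(String[] args) {
    List<int[]> files = Arrays.asList(new int[] {1, 2, 3}, new int[] {4, 9});
    System.out.println(read(files, 5, false)); // [9]
    System.out.println(read(files, 5, true));  // [4, 9]
  }
}
```

In both runs the first file is pruned; only the second run lets the non-matching row 4 through, since no per-row filter is applied.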
-
planWith
public ThisT planWith(java.util.concurrent.ExecutorService executorService)
Description copied from interface: Scan
Create a new scan to use a particular executor to plan. The default worker pool is used unless another executor is provided.
- Specified by:
planWith in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
executorService - the provided executor
- Returns:
- a table scan that uses the provided executor to access manifests
-
schema
public Schema schema()
Description copied from interface: Scan
Returns this scan's projection Schema.
If the projection schema was set directly using Scan.project(Schema), returns that schema.
If the projection schema was set by calling Scan.select(Collection), returns a projection schema that includes the selected data fields and any fields used in the filter expression.
- Specified by:
schema in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
- this scan's projection schema
-
splitLookback
public int splitLookback()
Description copied from interface: Scan
Returns the split lookback for this scan.
- Specified by:
splitLookback in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
-
splitOpenFileCost
public long splitOpenFileCost()
Description copied from interface: Scan
Returns the split open file cost for this scan.
- Specified by:
splitOpenFileCost in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
-
metricsReporter
public ThisT metricsReporter(MetricsReporter reporter)
Description copied from interface: Scan
Create a new scan that will report scan metrics to the provided reporter in addition to reporters maintained by the scan.
- Specified by:
metricsReporter in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
-
-