Package org.apache.iceberg
Class BaseFileScanTask
java.lang.Object
org.apache.iceberg.BaseFileScanTask
- All Implemented Interfaces:
Serializable, ContentScanTask<DataFile>, FileScanTask, PartitionScanTask, ScanTask, SplittableScanTask<FileScanTask>
Constructor Summary
Constructor | Description
BaseFileScanTask(DataFile file, DeleteFile[] deletes, String schemaString, String specString, ResidualEvaluator residuals) |
Method Summary
Modifier and Type | Method | Description
List<DeleteFile> | deletes() | A list of delete files to apply when reading the task's data file.
long | estimatedRowsCount() | The estimated number of rows produced by this scan task.
DataFile | file() | The file to scan.
int | filesCount() | The number of files that will be opened by this scan task.
long | length() | The number of bytes to scan from the ContentScanTask.start() position in the file.
protected FileScanTask | newSplitTask(FileScanTask parentTask, long offset, long length) |
Expression | residual() | Returns the residual expression that should be applied to rows in this file scan.
Schema | schema() | Return the schema for this file scan task.
protected FileScanTask | self() |
long | sizeBytes() | The number of bytes that should be read by this scan task.
PartitionSpec | spec() | Returns the spec of the partition for this scan task.
Iterable<FileScanTask> | split(long targetSplitSize) | Attempts to split this scan task into several smaller scan tasks, each close to targetSplitSize in size.
long | start() | The starting position of this scan range in the file.
String | toString() |
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.iceberg.ContentScanTask
estimatedRowsCount, file, length, partition, residual, start
Methods inherited from interface org.apache.iceberg.FileScanTask
asFileScanTask, isFileScanTask
Methods inherited from interface org.apache.iceberg.PartitionScanTask
spec
Methods inherited from interface org.apache.iceberg.ScanTask
asCombinedScanTask, asDataTask, isDataTask
Methods inherited from interface org.apache.iceberg.SplittableScanTask
split
-
Constructor Details
-
BaseFileScanTask
public BaseFileScanTask(DataFile file, DeleteFile[] deletes, String schemaString, String specString, ResidualEvaluator residuals)
-
-
Method Details
-
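self
protected FileScanTask self()
-
newSplitTask
protected FileScanTask newSplitTask(FileScanTask parentTask, long offset, long length)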
-
deletes
Description copied from interface: FileScanTask
A list of delete files to apply when reading the task's data file.
- Specified by:
deletes in interface FileScanTask
- Returns:
a list of delete files to apply
-
sizeBytes
public long sizeBytes()
Description copied from interface: ScanTask
The number of bytes that should be read by this scan task.
- Specified by:
sizeBytes in interface ContentScanTask<DataFile>
- Specified by:
sizeBytes in interface FileScanTask
- Specified by:
sizeBytes in interface ScanTask
- Returns:
the total number of bytes to read
-
filesCount
public int filesCount()
Description copied from interface: ScanTask
The number of files that will be opened by this scan task.
- Specified by:
filesCount in interface FileScanTask
- Specified by:
filesCount in interface ScanTask
- Returns:
the number of files to open
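The two counters above can be read together. The following is a minimal sketch, assuming that a task opens its one data file plus every delete file, and reads its scan range plus every delete file in full. This page does not state the formula, so the class TaskCostSketch and both helper methods are hypothetical stand-ins rather than Iceberg code.

```java
// Hypothetical sketch; not part of org.apache.iceberg.
final class TaskCostSketch {
    private TaskCostSketch() {}

    // Assumption: one data file plus each delete file is opened.
    static int filesCount(int deleteFileCount) {
        return 1 + deleteFileCount;
    }

    // Assumption: bytes read = scan range length + full size of each delete file.
    static long sizeBytes(long scanLength, long[] deleteFileSizes) {
        long total = scanLength;
        for (long size : deleteFileSizes) {
            total += size;
        }
        return total;
    }
}
```

Under these assumptions, a task with no delete files reports a file count of 1 and a size equal to the length of its scan range.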
-
schema
Description copied from interface: FileScanTask
Return the schema for this file scan task.
- Specified by:
schema in interface FileScanTask
-
file
Description copied from interface: ContentScanTask
The file to scan.
- Specified by:
file in interface ContentScanTask<DataFile>
- Returns:
the file to scan
-
spec
Description copied from interface: PartitionScanTask
Returns the spec of the partition for this scan task.
- Specified by:
spec in interface PartitionScanTask
-
start
public long start()
Description copied from interface: ContentScanTask
The starting position of this scan range in the file.
- Specified by:
start in interface ContentScanTask<DataFile>
- Returns:
the start position of this scan range
-
length
public long length()
Description copied from interface: ContentScanTask
The number of bytes to scan from the ContentScanTask.start() position in the file.
- Specified by:
length in interface ContentScanTask<DataFile>
- Returns:
the length of this scan range in bytes
-
residual
Description copied from interface: ContentScanTask
Returns the residual expression that should be applied to rows in this file scan.
The residual expression for a file is a filter expression created by partially evaluating the scan's filter using the file's partition data.
- Specified by:
residual in interface ContentScanTask<DataFile>
- Returns:
a residual expression to apply to rows from this scan
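The idea of partial evaluation described above can be illustrated with a deliberately small model. The sketch below assumes a filter that is an AND of equality predicates and a partition whose values are stored under the same column names (identity partitioning). ResidualSketch, Eq, and residual are hypothetical names for this illustration and are not the Iceberg Expression or ResidualEvaluator API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; not part of org.apache.iceberg.
final class ResidualSketch {
    private ResidualSketch() {}

    // Hypothetical predicate "column = value"; a filter is the AND of its predicates.
    static final class Eq {
        final String column;
        final String value;

        Eq(String column, String value) {
            this.column = column;
            this.value = value;
        }

        @Override
        public String toString() {
            return column + " = " + value;
        }
    }

    // Partially evaluates the filter against one file's partition values.
    // Optional.empty() means no row in the file can match ("always false").
    // An empty list means every row matches ("always true").
    static Optional<List<Eq>> residual(List<Eq> filter, Map<String, String> partition) {
        List<Eq> remaining = new ArrayList<>();
        for (Eq predicate : filter) {
            if (partition.containsKey(predicate.column)) {
                if (!predicate.value.equals(partition.get(predicate.column))) {
                    return Optional.empty(); // contradicted by the partition value
                }
                // Satisfied by every row in the file, so it drops out of the residual.
            } else {
                remaining.add(predicate); // must still be checked row by row
            }
        }
        return Optional.of(remaining);
    }
}
```

For a filter such as day = 2024-01-01 AND status = open applied to a file in partition day = 2024-01-01, this model leaves only status = open to be checked against each row.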
-
estimatedRowsCount
public long estimatedRowsCount()
Description copied from interface: ScanTask
The estimated number of rows produced by this scan task.
- Specified by:
estimatedRowsCount in interface ContentScanTask<DataFile>
- Specified by:
estimatedRowsCount in interface ScanTask
- Returns:
the estimated number of produced rows
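This page does not say how the estimate is derived. One plausible approach, shown here only as a sketch, scales the file's record count by the fraction of the file that the scan range covers. RowEstimateSketch and estimateRows are hypothetical names, and the parameters fileSizeInBytes and recordCount are assumed inputs for this illustration.

```java
// Hypothetical sketch; not part of org.apache.iceberg.
final class RowEstimateSketch {
    private RowEstimateSketch() {}

    // Assumption: rows are spread evenly over the file's bytes.
    static long estimateRows(long scanLength, long fileSizeInBytes, long recordCount) {
        if (fileSizeInBytes <= 0) {
            return 0L; // avoid dividing by zero for an empty file
        }
        double scannedFraction = (double) scanLength / fileSizeInBytes;
        return (long) (scannedFraction * recordCount);
    }
}
```

Because the result is truncated to a long, the estimates for the splits of one file may sum to slightly less than the file's record count.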
-
split
Description copied from interface: SplittableScanTask
Attempts to split this scan task into several smaller scan tasks, each close to targetSplitSize in size.
Note the target split size is just guidance and the actual split size may be either smaller or larger. File formats like Parquet may leverage the row group offset information while splitting tasks.
- Specified by:
split in interface SplittableScanTask<FileScanTask>
- Parameters:
targetSplitSize - the target size of each new scan task in bytes
- Returns:
an Iterable of smaller tasks
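As a rough illustration of size-based splitting, the sketch below cuts a scan range into consecutive fixed-size pieces. It ignores format-specific information such as Parquet row group offsets, so its pieces never exceed the target, whereas the real method treats the target only as guidance. SplitSketch, ByteRange, and this static split helper are hypothetical and do not reproduce the Iceberg implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of org.apache.iceberg.
final class SplitSketch {
    private SplitSketch() {}

    // Hypothetical stand-in for a task's scan range: [start, start + length).
    static final class ByteRange {
        final long start;
        final long length;

        ByteRange(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    // Cuts [start, start + length) into consecutive ranges of at most targetSplitSize bytes.
    static List<ByteRange> split(long start, long length, long targetSplitSize) {
        if (targetSplitSize <= 0) {
            throw new IllegalArgumentException("targetSplitSize must be positive");
        }
        List<ByteRange> parts = new ArrayList<>();
        long end = start + length;
        for (long offset = start; offset < end; offset += targetSplitSize) {
            parts.add(new ByteRange(offset, Math.min(targetSplitSize, end - offset)));
        }
        return parts;
    }
}
```

In this model each piece plays the role that the offset and length arguments of newSplitTask play for a child task: the pieces are contiguous and together cover the parent range exactly once.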
-
toString
-