Package org.apache.iceberg.actions
Class RewriteGroupBase<I,T extends ContentScanTask<F>,F extends ContentFile<F>>
java.lang.Object
org.apache.iceberg.actions.RewriteGroupBase<I,T,F>
- Type Parameters:
I - the Java type of the plan info member, like RewriteDataFiles.FileGroupInfo or RewritePositionDeleteFiles.FileGroupInfo
T - the Java type of the input scan tasks (input)
F - the Java type of the content files (input and output)
- Direct Known Subclasses:
RewriteFileGroup, RewritePositionDeletesGroup
public abstract class RewriteGroupBase<I,T extends ContentScanTask<F>,F extends ContentFile<F>>
extends Object
Container class representing a set of files to be rewritten by a FileRewriteRunner.
Method Summary
Modifier and Type, Method, Description:
- int expectedOutputFiles(): The total number of files that should be produced by the rewrite of this entire file group.
- fileScanTasks(): Scan tasks for input files.
- info(): Identifiers and partition information about the group.
- int inputFileNum(): Number of the input files.
- long inputFilesSizeInBytes(): Accumulated size for the input files.
- long inputSplitSize(): The amount of bytes of data the FileRewriteRunner should read from a single group in a single read task.
- long maxOutputFileSize(): The target file size which should be used by the FileRewriteRunner.
Method Details
info
Identifiers and partition information about the group.
fileScanTasks
Scan tasks for input files.
inputFilesSizeInBytes
public long inputFilesSizeInBytes()
Accumulated size for the input files.
inputFileNum
public int inputFileNum()
Number of the input files.
maxOutputFileSize
public long maxOutputFileSize()
The target file size which should be used by the FileRewriteRunner. The FileRewritePlanner could choose different values than those defined by the table properties.
- Returns:
the target size that should be used by the runner
inputSplitSize
public long inputSplitSize()
The amount of bytes of data the FileRewriteRunner should read from a single group in a single read task. The FileRewritePlanner chooses a value that allows parallelization for the runners but prevents fragmentation of the output caused by too many readers.
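The trade-off behind the split size can be illustrated with a little arithmetic. The sketch below is not Iceberg code: SplitSizeSketch and estimateReadTasks are hypothetical names, and the assumption that a group is read in ceil(inputFilesSizeInBytes / inputSplitSize) tasks is a simplification made only for illustration. It shows that a smaller split size yields more parallel readers, while a larger one yields fewer.

```java
public class SplitSizeSketch {

    // Hypothetical helper, not part of Iceberg: estimates how many read tasks
    // result if the input is cut into splits of at most splitSize bytes.
    static long estimateReadTasks(long inputBytes, long splitSize) {
        if (inputBytes < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("inputBytes must be >= 0 and splitSize > 0");
        }
        // Ceiling division written to avoid overflow for very large inputs.
        return inputBytes / splitSize + (inputBytes % splitSize == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;

        // Smaller splits: more readers, more parallelism, more potential output fragments.
        System.out.println(estimateReadTasks(oneGiB, 128L * 1024 * 1024)); // prints 8

        // Larger splits: fewer readers, less parallelism, less fragmentation.
        System.out.println(estimateReadTasks(oneGiB, 512L * 1024 * 1024)); // prints 2
    }
}
```

How the runner actually turns the split size into read tasks depends on the engine and is not specified here.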
expectedOutputFiles
public int expectedOutputFiles()
The total number of files that should be produced by the rewrite of this entire file group.
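To show how the accessors of a group relate to one another, here is a minimal stand-in written without any Iceberg dependency. RewriteGroupSketch is a hypothetical class that only mirrors the accessor names documented above, and naiveExpectedOutputFiles is an assumed approximation (total input size divided by the target file size, rounded up). The value a real planner assigns to a group may be computed differently.

```java
import java.util.List;

public class RewriteGroupSketch {
    private final List<Long> inputFileSizes; // sizes of the input files, in bytes
    private final long maxOutputFileSize;    // target output file size, in bytes

    public RewriteGroupSketch(List<Long> inputFileSizes, long maxOutputFileSize) {
        if (maxOutputFileSize <= 0) {
            throw new IllegalArgumentException("maxOutputFileSize must be > 0");
        }
        this.inputFileSizes = List.copyOf(inputFileSizes);
        this.maxOutputFileSize = maxOutputFileSize;
    }

    // Number of the input files.
    public int inputFileNum() {
        return inputFileSizes.size();
    }

    // Accumulated size for the input files.
    public long inputFilesSizeInBytes() {
        return inputFileSizes.stream().mapToLong(Long::longValue).sum();
    }

    public long maxOutputFileSize() {
        return maxOutputFileSize;
    }

    // Assumption for illustration only: total size / target size, rounded up.
    public int naiveExpectedOutputFiles() {
        long total = inputFilesSizeInBytes();
        long count = total / maxOutputFileSize + (total % maxOutputFileSize == 0 ? 0 : 1);
        return Math.toIntExact(count);
    }

    public static void main(String[] args) {
        RewriteGroupSketch group = new RewriteGroupSketch(List.of(100L, 200L, 300L), 250L);
        System.out.println(group.inputFileNum());             // prints 3
        System.out.println(group.inputFilesSizeInBytes());    // prints 600
        System.out.println(group.naiveExpectedOutputFiles()); // prints 3
    }
}
```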