Class RewriteGroupBase<I,T extends ContentScanTask<F>,F extends ContentFile<F>>

java.lang.Object
org.apache.iceberg.actions.RewriteGroupBase<I,T,F>
Type Parameters:
I - the Java type of the plan info member, such as RewriteDataFiles.FileGroupInfo or RewritePositionDeleteFiles.FileGroupInfo
T - the Java type of the input scan tasks (input)
F - the Java type of the content files (input and output)
Direct Known Subclasses:
RewriteFileGroup, RewritePositionDeletesGroup

public abstract class RewriteGroupBase<I,T extends ContentScanTask<F>,F extends ContentFile<F>> extends Object
Container class representing a set of files to be rewritten by a FileRewriteRunner.
  • Method Details

    • info

      public I info()
      Identifiers and partition information about the group.
    • fileScanTasks

      public List<T> fileScanTasks()
      Scan tasks for input files.
    • inputFilesSizeInBytes

      public long inputFilesSizeInBytes()
      Accumulated size of the input files, in bytes.
    • inputFileNum

      public int inputFileNum()
      Number of input files.
    • maxOutputFileSize

      public long maxOutputFileSize()
      The target file size that the FileRewriteRunner should use. The FileRewritePlanner may choose values that differ from those defined by the table properties.
      Returns:
      the target size that the runner should use
    • inputSplitSize

      public long inputSplitSize()
      The number of bytes of data the FileRewriteRunner should read from a single group in a single read task. The FileRewritePlanner chooses a value that lets the runners parallelize their reads while preventing the output fragmentation caused by too many readers.
    • expectedOutputFiles

      public int expectedOutputFiles()
      The total number of files that should be produced by the rewrite of this entire file group.
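
The sizing accessors above can be related to each other with some simple arithmetic. The following is a minimal, self-contained sketch, not Iceberg's planner logic: GroupView is a hypothetical stand-in that only mirrors the accessor names documented here, and estimatedReadTasks and averageOutputFileSize are illustrative helpers that do not exist in the Iceberg API. It assumes that read tasks are cut purely by inputSplitSize and that the output is spread evenly over expectedOutputFiles.

```java
public class RewriteGroupSketch {

  // Hypothetical stand-in mirroring the accessors documented above.
  // This is NOT org.apache.iceberg.actions.RewriteGroupBase.
  public static final class GroupView {
    private final long inputFilesSizeInBytes;
    private final int inputFileNum;
    private final long maxOutputFileSize;
    private final long inputSplitSize;
    private final int expectedOutputFiles;

    public GroupView(long inputFilesSizeInBytes, int inputFileNum,
                     long maxOutputFileSize, long inputSplitSize,
                     int expectedOutputFiles) {
      this.inputFilesSizeInBytes = inputFilesSizeInBytes;
      this.inputFileNum = inputFileNum;
      this.maxOutputFileSize = maxOutputFileSize;
      this.inputSplitSize = inputSplitSize;
      this.expectedOutputFiles = expectedOutputFiles;
    }

    public long inputFilesSizeInBytes() { return inputFilesSizeInBytes; }
    public int inputFileNum() { return inputFileNum; }
    public long maxOutputFileSize() { return maxOutputFileSize; }
    public long inputSplitSize() { return inputSplitSize; }
    public int expectedOutputFiles() { return expectedOutputFiles; }
  }

  // Ceiling division for positive values.
  static long ceilDiv(long dividend, long divisor) {
    return (dividend + divisor - 1) / divisor;
  }

  // Assumption: each read task covers at most inputSplitSize bytes of the group.
  public static long estimatedReadTasks(GroupView group) {
    return ceilDiv(group.inputFilesSizeInBytes(), group.inputSplitSize());
  }

  // Assumption: the rewritten data is spread evenly over the expected output files.
  public static long averageOutputFileSize(GroupView group) {
    return group.inputFilesSizeInBytes() / group.expectedOutputFiles();
  }

  public static void main(String[] args) {
    long mib = 1L << 20;
    // 40 input files totalling 1000 MiB, 512 MiB target size,
    // 256 MiB split size, 2 expected output files (all values made up).
    GroupView group = new GroupView(1000 * mib, 40, 512 * mib, 256 * mib, 2);

    System.out.println("read tasks: " + estimatedReadTasks(group));
    System.out.println("avg output MiB: " + averageOutputFileSize(group) / mib);
    System.out.println("within target size: "
        + (averageOutputFileSize(group) <= group.maxOutputFileSize()));
  }
}
```

With these made-up numbers, a smaller inputSplitSize would yield more read tasks and therefore more parallelism, which is the trade-off against output fragmentation that the inputSplitSize description refers to.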