Class RewriteDataFilesAction

    • Method Detail

      • table

        protected Table table()
      • outputSpecId

        public RewriteDataFilesAction outputSpecId(int specId)
        Pass a PartitionSpec id to specify which PartitionSpec should be used when rewriting DataFiles.
        Parameters:
        specId - id of the PartitionSpec to use for the rewritten data files
        Returns:
        this for method chaining
      • targetSizeInBytes

        public RewriteDataFilesAction targetSizeInBytes(long targetSize)
        Specify the target size, in bytes, of the rewritten data files.
        Parameters:
        targetSize - target size in bytes of each rewritten data file
        Returns:
        this for method chaining
      • splitLookback

        public RewriteDataFilesAction splitLookback(int lookback)
        Specify the number of "bins" considered when trying to pack the next file split into a task. Increasing this usually makes tasks a bit more even by considering more ways to pack file regions into a single task, at the cost of extra planning time.

        This configuration can reorder the incoming file regions. To preserve order for lower/upper bounds in file metadata, use a lookback of 1.

        Parameters:
        lookback - number of "bins" considered when trying to pack the next file split into a task.
        Returns:
        this for method chaining
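
        The effect of the lookback can be sketched with a small stand-alone bin packer. This is an illustration only, not Iceberg's actual implementation: the class LookbackPackingSketch, the pack method, and the rule of closing the oldest open bin first are all assumptions made for the example. It shows why a lookback of 1 keeps the input order while a larger lookback can produce fuller bins by reordering.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of lookback-limited bin packing; not Iceberg code.
public class LookbackPackingSketch {

    /** A bin that accumulates item weights up to a target weight. */
    static final class Bin {
        final List<Long> items = new ArrayList<>();
        long weight = 0;

        boolean canAdd(long w, long target) {
            return weight + w <= target;
        }

        void add(long w) {
            items.add(w);
            weight += w;
        }
    }

    /** Packs weights in arrival order, keeping at most `lookback` bins open. */
    static List<List<Long>> pack(List<Long> weights, long target, int lookback) {
        if (lookback < 1) {
            throw new IllegalArgumentException("lookback must be at least 1");
        }
        Deque<Bin> open = new ArrayDeque<>();
        List<List<Long>> done = new ArrayList<>();
        for (long w : weights) {
            // Look for the first open bin with room for this item.
            Bin chosen = null;
            for (Bin b : open) {
                if (b.canAdd(w, target)) {
                    chosen = b;
                    break;
                }
            }
            if (chosen == null) {
                chosen = new Bin();
                open.addLast(chosen);
                // Too many open bins: close the oldest one (assumed policy).
                if (open.size() > lookback) {
                    done.add(open.removeFirst().items);
                }
            }
            chosen.add(w);
        }
        // Flush the remaining open bins in creation order.
        while (!open.isEmpty()) {
            done.add(open.removeFirst().items);
        }
        return done;
    }

    public static void main(String[] args) {
        List<Long> weights = Arrays.asList(6L, 5L, 4L, 5L);
        System.out.println(pack(weights, 10L, 1)); // prints [[6], [5, 4], [5]]
        System.out.println(pack(weights, 10L, 2)); // prints [[6, 4], [5, 5]]
    }
}
```

        With a lookback of 1 the items stay in their original order across bins; with a lookback of 2 the item of weight 4 is moved ahead of the first 5, which yields two full bins instead of three uneven ones.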
      • splitOpenFileCost

        public RewriteDataFilesAction splitOpenFileCost(long openFileCost)
        Specify the minimum file size counted when packing files into one "bin". If a file being read is smaller than this threshold, Iceberg uses this value as the file's size for packing purposes.

        This configuration controls the number of files compacted in each task. A small value leads to more files being packed into each task, and so to more aggressive compaction. The default value is 4MB.

        Parameters:
        openFileCost - minimum file size counted when packing files into one "bin".
        Returns:
        this for method chaining
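
        The arithmetic behind the open file cost can be sketched as follows. The class OpenFileCostSketch and its methods are hypothetical names for this example; the only behavior taken from the description above is that a file smaller than the threshold is counted as the threshold. The sketch counts how many 1MB files fit into one 128MB task under two different cost settings.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of how an open file cost limits files per task.
public class OpenFileCostSketch {

    /** A file smaller than the open file cost is counted as the cost itself. */
    static long weight(long fileSizeBytes, long openFileCost) {
        return Math.max(fileSizeBytes, openFileCost);
    }

    /** Counts how many files, in order, fit into a single task of targetSize. */
    static int filesInFirstTask(List<Long> sizes, long targetSize, long openFileCost) {
        long used = 0;
        int count = 0;
        for (long size : sizes) {
            long w = weight(size, openFileCost);
            if (used + w > targetSize) {
                break;
            }
            used += w;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        List<Long> sizes = Collections.nCopies(200, mb); // 200 files of 1MB each
        System.out.println(filesInFirstTask(sizes, 128 * mb, 4 * mb)); // prints 32
        System.out.println(filesInFirstTask(sizes, 128 * mb, 1 * mb)); // prints 128
    }
}
```

        Lowering the cost from 4MB to 1MB raises the number of small files combined in one task from 32 to 128, which is the sense in which a small value leads to more compaction per task.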
      • filter

        public RewriteDataFilesAction filter(Expression expr)
        Pass a row Expression to select the DataFiles to be rewritten. Note that any file that may contain data matching the filter may be rewritten.
        Parameters:
        expr - row Expression used to select the DataFiles to rewrite
        Returns:
        this for method chaining
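
        Why a row filter selects whole files, including files with no matching row, can be sketched with per-file lower/upper bounds. This is a simplified illustration under the assumption that files are selected by comparing the filter value against column bounds in file metadata; InclusiveFilterSketch, FileBounds, and the file names are hypothetical and are not part of the Iceberg API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of inclusive, bounds-based file selection.
public class InclusiveFilterSketch {

    /** Lower and upper bound of one column for a single data file. */
    static final class FileBounds {
        final String path;
        final long lower;
        final long upper;

        FileBounds(String path, long lower, long upper) {
            this.path = path;
            this.lower = lower;
            this.upper = upper;
        }
    }

    /** True if the file may contain a row where the column equals value. */
    static boolean mightContain(FileBounds f, long value) {
        return f.lower <= value && value <= f.upper;
    }

    /** Returns the paths of all files that may contain a matching row. */
    static List<String> selectForRewrite(List<FileBounds> files, long value) {
        List<String> selected = new ArrayList<>();
        for (FileBounds f : files) {
            if (mightContain(f, value)) {
                selected.add(f.path);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<FileBounds> files = Arrays.asList(
            new FileBounds("a.parquet", 1, 10),
            new FileBounds("b.parquet", 5, 20),
            new FileBounds("c.parquet", 30, 40));
        // Equivalent in spirit to a row filter "id == 7".
        System.out.println(selectForRewrite(files, 7)); // prints [a.parquet, b.parquet]
    }
}
```

        Both a.parquet and b.parquet are selected because their bounds include 7, even though either file might not hold a row with that value; only c.parquet can be ruled out.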
      • set

        public ThisT set(java.lang.String property,
                         java.lang.String value)
        Specified by:
        set in interface SnapshotUpdateAction<ThisT,R>
      • metadataTableName

        protected java.lang.String metadataTableName(MetadataTableType type)