Interface FileRewriteRunner<I,T extends ContentScanTask<F>,F extends ContentFile<F>,G extends RewriteGroupBase<I,T,F>>

Type Parameters:
I - the Java type of the plan info like RewriteDataFiles.FileGroupInfo or RewritePositionDeleteFiles.FileGroupInfo
T - the Java type of the input scan tasks
F - the Java type of the content files (input and output)
G - the Java type of the rewrite file group like RewriteFileGroup or RewritePositionDeletesGroup

public interface FileRewriteRunner<I,T extends ContentScanTask<F>,F extends ContentFile<F>,G extends RewriteGroupBase<I,T,F>>
An interface for rewriting content file groups (RewriteGroupBase). The lifecycle of a runner is as follows:
  • init(Map) initializes the runner with the configuration parameters.
  • rewrite(RewriteGroupBase) is called for every group in the plan; it performs the actual rewrite of the files and returns the newly generated files.
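The lifecycle above can be illustrated with a minimal, self-contained sketch. It does not use the Iceberg classes: SimpleRunner is a hypothetical stand-in for this interface with the type parameters removed, a "group" is assumed to be a plain list of file names, and ConcatRunner, LifecycleSketch and the "output-prefix" option are invented for illustration only.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for FileRewriteRunner. The real interface is
// generic over Iceberg types; here a group is just a list of file names.
interface SimpleRunner {
  Set<String> validOptions();

  void init(Map<String, String> options);

  Set<String> rewrite(List<String> group);
}

// Hypothetical runner that "rewrites" each group into one output file name.
class ConcatRunner implements SimpleRunner {
  private String prefix = "merged";

  @Override
  public Set<String> validOptions() {
    return Set.of("output-prefix"); // invented option name
  }

  @Override
  public void init(Map<String, String> options) {
    prefix = options.getOrDefault("output-prefix", prefix);
  }

  @Override
  public Set<String> rewrite(List<String> group) {
    // No real I/O: the returned name only records how many inputs were combined.
    return Set.of(prefix + "-" + group.size() + "-files");
  }
}

class LifecycleSketch {
  // Drives the lifecycle: init once, then rewrite for every group in the plan.
  static Set<String> run(
      SimpleRunner runner, Map<String, String> options, List<List<String>> plan) {
    runner.init(options);
    Set<String> added = new LinkedHashSet<>();
    for (List<String> group : plan) {
      added.addAll(runner.rewrite(group));
    }
    return added;
  }
}
```

In this sketch the caller collects the files returned by each rewrite call; what a real action does with the returned set (for example, committing it to the table) is outside the scope of this interface description.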
  • Method Summary

    Modifier and Type   Method                             Description
    default String      description()                      Returns a description for this runner.
    void                init(Map<String,String> options)   Initializes this runner using provided options.
    Set<F>              rewrite(G group)                   Rewrite a group of files represented by the given list of scan tasks.
    Set<String>         validOptions()                     Returns a set of supported options for this runner.
  • Method Details

    • description

      default String description()
      Returns a description for this runner.
    • validOptions

      Set<String> validOptions()
      Returns a set of supported options for this runner. Only options specified in this set will be accepted at runtime. Any other options will be rejected.
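
      This page does not say where or how unsupported options are rejected. As one possible illustration, a caller could compare the supplied option keys against the set returned by validOptions() before calling init. The OptionValidation class, its validate method, and the option names in the usage note below are hypothetical and not part of this API.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class OptionValidation {
  // Hypothetical helper: throws if any supplied key is not in the valid set.
  static void validate(Set<String> validOptions, Map<String, String> options) {
    // Copy the keys so the caller's map is left untouched; TreeSet keeps the
    // error message in a stable, sorted order.
    Set<String> invalid = new TreeSet<>(options.keySet());
    invalid.removeAll(validOptions);
    if (!invalid.isEmpty()) {
      throw new IllegalArgumentException(
          "Unsupported options: " + invalid
              + "; valid options: " + new TreeSet<>(validOptions));
    }
  }
}
```

      For example, validate(Set.of("max-output-size"), Map.of("bogus", "1")) throws an IllegalArgumentException naming "bogus", while passing only "max-output-size" returns normally.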
    • init

      void init(Map<String,String> options)
      Initializes this runner using provided options.
      Parameters:
      options - options to initialize this runner
    • rewrite

      Set<F> rewrite(G group)
      Rewrite a group of files represented by the given list of scan tasks.

      Implementations are expected to be engine-specific (e.g. Spark, Flink, Trino).

      Parameters:
      group - the group of scan tasks for files to be rewritten together
      Returns:
      a set of newly written files