Package org.apache.iceberg.actions
Interface FileRewriter<T extends ContentScanTask<F>,F extends ContentFile<F>>
Type Parameters:
- T - the Java type of tasks to read content files
- F - the Java type of content files
All Known Implementing Classes:
- SizeBasedDataRewriter, SizeBasedFileRewriter, SizeBasedPositionDeletesRewriter
 
public interface FileRewriter<T extends ContentScanTask<F>,F extends ContentFile<F>>

An interface for rewriting content files.

The entire rewrite operation is broken down into pieces based on partitioning, and then into size-based groups within each partition. These subunits of the rewrite are referred to as file groups. Each file group is processed by a single framework "action". For example, in Spark each group would be rewritten in its own Spark job.
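The call order implied by this description can be sketched as follows. This is only an illustration under stated assumptions: `SketchRewriter` is a hypothetical stand-in that copies the method shapes listed on this page but drops the Iceberg type bounds so that it compiles without Iceberg on the classpath, and `RewriteDriverSketch` is a hypothetical driver, not part of the library.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in mirroring the five methods documented on this page.
// It is NOT org.apache.iceberg.actions.FileRewriter; the type bounds are omitted.
interface SketchRewriter<T, F> {
  // Placeholder value; this page does not say what the real default returns.
  default String description() {
    return "sketch rewriter";
  }

  Set<String> validOptions();

  void init(Map<String, String> options);

  Iterable<List<T>> planFileGroups(Iterable<T> tasks);

  Set<F> rewrite(List<T> group);
}

final class RewriteDriverSketch {
  private RewriteDriverSketch() {}

  // Handles the scan tasks of one partition: initialize once, plan the file
  // groups, then rewrite each group as its own unit of work.
  static <T, F> Set<F> rewritePartition(
      SketchRewriter<T, F> rewriter, Map<String, String> options, Iterable<T> partitionTasks) {
    rewriter.init(options);
    Set<F> newFiles = new LinkedHashSet<>();
    for (List<T> group : rewriter.planFileGroups(partitionTasks)) {
      // In Spark, each iteration would correspond to a separate job.
      newFiles.addAll(rewriter.rewrite(group));
    }
    return newFiles;
  }
}
```

A real engine integration would typically run the groups concurrently and commit the results to the table; the loop above only shows the order in which the documented methods are called.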
Method Summary
- default java.lang.String description() - Returns a description for this rewriter.
- void init(java.util.Map<java.lang.String,java.lang.String> options) - Initializes this rewriter using provided options.
- java.lang.Iterable<java.util.List<T>> planFileGroups(java.lang.Iterable<T> tasks) - Selects files which this rewriter believes are valid targets to be rewritten based on their scan tasks and groups those scan tasks into file groups.
- java.util.Set<F> rewrite(java.util.List<T> group) - Rewrite a group of files represented by the given list of scan tasks.
- java.util.Set<java.lang.String> validOptions() - Returns a set of supported options for this rewriter.
 
Method Detail

description
default java.lang.String description()
Returns a description for this rewriter.

validOptions
java.util.Set<java.lang.String> validOptions()
Returns a set of supported options for this rewriter. Only options in this set are accepted at runtime; any other option is rejected.
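The accept-or-reject rule for options can be sketched as a small check that a caller might run before `init(options)`. The class and method names below are hypothetical, and this page does not say where the real check happens or which exception it raises; `IllegalArgumentException` is an assumption.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; not part of the Iceberg API.
final class OptionValidationSketch {
  private OptionValidationSketch() {}

  // Returns the supplied option keys that are not in the supported set,
  // sorted so that error messages are stable.
  static Set<String> invalidKeys(Set<String> validOptions, Map<String, String> options) {
    Set<String> invalid = new TreeSet<>(options.keySet());
    invalid.removeAll(validOptions);
    return invalid;
  }

  // Throws if any supplied option is unsupported.
  // Assumption: the exception type is chosen for illustration only.
  static void validate(Set<String> validOptions, Map<String, String> options) {
    Set<String> invalid = invalidKeys(validOptions, options);
    if (!invalid.isEmpty()) {
      throw new IllegalArgumentException("Unsupported options: " + invalid);
    }
  }
}
```

For example, `validate(rewriter.validOptions(), userOptions)` would pass silently when every key is supported and fail fast on a misspelled key.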

init
void init(java.util.Map<java.lang.String,java.lang.String> options)
Initializes this rewriter using the provided options.
Parameters:
- options - options to initialize this rewriter
 
planFileGroups
java.lang.Iterable<java.util.List<T>> planFileGroups(java.lang.Iterable<T> tasks)
Selects files that this rewriter considers valid targets for rewriting, based on their scan tasks, and groups those scan tasks into file groups. Each file group is then rewritten in a single executable unit, such as a Spark job.
Parameters:
- tasks - an iterable of scan tasks for files in a partition
Returns:
- groups of scan tasks for files to be rewritten in a single executable unit
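One simple way to form size-based groups is sketched below. This page does not specify how the implementing classes select or pack files, so the greedy, order-preserving strategy here is an assumption, as are the names `SizeGroupingSketch`, `groupBySize`, `sizeOf`, and `maxGroupBytes`. The sketch covers only the grouping half of planning; it does not filter out files that are not worth rewriting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical helper; not part of the Iceberg API.
final class SizeGroupingSketch {
  private SizeGroupingSketch() {}

  // Packs tasks into groups in input order, starting a new group whenever
  // adding the next task would push the current group past maxGroupBytes.
  // A task larger than the limit ends up alone in its own group.
  static <T> List<List<T>> groupBySize(
      Iterable<T> tasks, ToLongFunction<T> sizeOf, long maxGroupBytes) {
    List<List<T>> groups = new ArrayList<>();
    List<T> current = new ArrayList<>();
    long currentBytes = 0L;
    for (T task : tasks) {
      long size = sizeOf.applyAsLong(task);
      if (!current.isEmpty() && currentBytes + size > maxGroupBytes) {
        groups.add(current);
        current = new ArrayList<>();
        currentBytes = 0L;
      }
      current.add(task);
      currentBytes += size;
    }
    if (!current.isEmpty()) {
      groups.add(current);
    }
    return groups;
  }
}
```

With task sizes 40, 50, 30, 120, 10 and a limit of 100, this produces four groups: [40, 50], [30], [120], and [10]. Each returned list would then be handed to `rewrite` as one file group.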
 
rewrite
java.util.Set<F> rewrite(java.util.List<T> group)
Rewrites a group of files represented by the given list of scan tasks.
Implementations are expected to be engine-specific (e.g. Spark, Flink, Trino).
Parameters:
- group - a group of scan tasks for files to be rewritten together
Returns:
- a set of newly written files
 
 