Package org.apache.iceberg.actions
Class SortStrategy
- java.lang.Object
  - org.apache.iceberg.actions.BinPackStrategy
    - org.apache.iceberg.actions.SortStrategy
- All Implemented Interfaces:
java.io.Serializable, RewriteStrategy
- Direct Known Subclasses:
SparkSortStrategy
public abstract class SortStrategy extends BinPackStrategy
A rewrite strategy for data files that reorders data within data files to lay them out optimally in relation to a column. For example, if the sort strategy is applied to a set of files ordered by column x, and the original files are File A (x: 0 - 50), File B (x: 10 - 40) and File C (x: 30 - 60), this strategy will attempt to rewrite them into File A' (x: 0 - 20), File B' (x: 21 - 40) and File C' (x: 41 - 60).
Currently there is no file overlap detection, and all files are rewritten if BinPackStrategy.REWRITE_ALL is true (default: false). If this property is disabled, any files that would be chosen by BinPackStrategy will be rewrite candidates.
In the future, other algorithms for determining which files to rewrite will be provided.
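The range example in the class description can be sketched with plain Java collections. This is only an illustration of the intended layout, assuming integer values for column x and an equal-count split of the sorted rows. The names SortLayoutSketch and rewrite are hypothetical and are not part of the Iceberg API; SortStrategy itself is abstract, so the real rewrite is left to subclasses such as SparkSortStrategy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only: these names are hypothetical, not Iceberg API.
public class SortLayoutSketch {

    // Merges the rows of all input files, sorts them on column x, and splits
    // the result into up to numFiles contiguous chunks. Returns the
    // {min, max} value of x for each output chunk.
    static List<int[]> rewrite(List<List<Integer>> inputFiles, int numFiles) {
        // Gather every row from the (possibly overlapping) input files.
        List<Integer> all = new ArrayList<>();
        for (List<Integer> file : inputFiles) {
            all.addAll(file);
        }
        // Sort globally so that each output chunk covers one contiguous range.
        Collections.sort(all);

        List<int[]> ranges = new ArrayList<>();
        int n = all.size();
        for (int i = 0; i < numFiles; i++) {
            int start = (int) ((long) i * n / numFiles);
            int end = (int) ((long) (i + 1) * n / numFiles);
            if (start < end) {
                // With duplicate values, adjacent ranges may share a boundary
                // value, but they never interleave.
                ranges.add(new int[] {all.get(start), all.get(end - 1)});
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Three input files whose x ranges overlap, as in the description.
        List<List<Integer>> files = Arrays.asList(
            Arrays.asList(0, 25, 50),
            Arrays.asList(10, 20, 40),
            Arrays.asList(30, 45, 60));
        for (int[] r : rewrite(files, 3)) {
            System.out.println("x: " + r[0] + " - " + r[1]);
        }
    }
}
```

Because the output ranges no longer overlap, a reader filtering on x can skip whole files using their min/max statistics, which is the layout benefit the description refers to.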
- See Also:
- Serialized Form
-
-
Field Summary
-
Fields inherited from class org.apache.iceberg.actions.BinPackStrategy
DELETE_FILE_THRESHOLD, DELETE_FILE_THRESHOLD_DEFAULT, MAX_FILE_SIZE_BYTES, MAX_FILE_SIZE_DEFAULT_RATIO, MIN_FILE_SIZE_BYTES, MIN_FILE_SIZE_DEFAULT_RATIO, MIN_INPUT_FILES, MIN_INPUT_FILES_DEFAULT, REWRITE_ALL, REWRITE_ALL_DEFAULT
-
-
Constructor Summary
Constructors
Constructor | Description
SortStrategy() |
-
Method Summary
All Methods | Instance Methods | Concrete Methods
Modifier and Type | Method | Description
java.lang.String | name() | Returns the name of this rewrite strategy
RewriteStrategy | options(java.util.Map<java.lang.String,java.lang.String> options) | Sets options to be used with this strategy
protected SortOrder | sortOrder() |
SortStrategy | sortOrder(SortOrder order) | Sets the sort order to be used in this strategy when rewriting files
protected void | validateOptions() |
java.util.Set<java.lang.String> | validOptions() | Returns a set of options which this rewrite strategy can use.
-
Methods inherited from class org.apache.iceberg.actions.BinPackStrategy
inputFileSize, numOutputFiles, planFileGroups, selectFilesToRewrite, splitSize, targetFileSize, writeMaxFileSize
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.iceberg.actions.RewriteStrategy
rewriteFiles, table
-
Method Detail
-
sortOrder
public SortStrategy sortOrder(SortOrder order)
Sets the sort order to be used in this strategy when rewriting files
- Parameters:
order - the order to use
- Returns:
this for method chaining
-
sortOrder
protected SortOrder sortOrder()
-
name
public java.lang.String name()
Description copied from interface: RewriteStrategy
Returns the name of this rewrite strategy
- Specified by:
name in interface RewriteStrategy
- Overrides:
name in class BinPackStrategy
-
validOptions
public java.util.Set<java.lang.String> validOptions()
Description copied from interface: RewriteStrategy
Returns a set of options which this rewrite strategy can use. This is an allow-list; any options not specified here will be rejected at runtime.
- Specified by:
validOptions in interface RewriteStrategy
- Overrides:
validOptions in class BinPackStrategy
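The allow-list behaviour described for validOptions can be sketched as a set difference between the supplied option keys and the permitted ones. This is a minimal sketch under stated assumptions, not the library's implementation: the class AllowListSketch, the method validate, the exception type, and the option keys used here are all hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only: these names are hypothetical, not Iceberg API.
public class AllowListSketch {

    // Rejects the options map if it contains any key outside the allow-list.
    static void validate(Set<String> validOptions, Map<String, String> options) {
        // TreeSet keeps the rejected keys in a stable order for the message.
        Set<String> invalid = new TreeSet<>(options.keySet());
        invalid.removeAll(validOptions);
        if (!invalid.isEmpty()) {
            throw new IllegalArgumentException(
                "Unsupported options for this strategy: " + invalid);
        }
    }

    public static void main(String[] args) {
        // Assumed allow-list with a single placeholder key.
        Set<String> valid = new HashSet<>(Collections.singleton("rewrite-all"));

        // Accepted: every key is in the allow-list.
        validate(valid, Collections.singletonMap("rewrite-all", "true"));

        // Rejected: the key is not in the allow-list.
        try {
            validate(valid, Collections.singletonMap("not-an-option", "1"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast on unknown keys means a misspelled option name surfaces as an error instead of being silently ignored.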
-
options
public RewriteStrategy options(java.util.Map<java.lang.String,java.lang.String> options)
Description copied from interface: RewriteStrategy
Sets options to be used with this strategy
- Specified by:
options in interface RewriteStrategy
- Overrides:
options in class BinPackStrategy
-
validateOptions
protected void validateOptions()
-