public abstract class SortStrategy extends BinPackStrategy
Currently there is no file overlap detection, and all files are rewritten if REWRITE_ALL is true (default: false). If this property is disabled, any files that would be chosen by BinPackStrategy will be rewrite candidates.
In the future, other algorithms for determining which files to rewrite will be provided.
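The selection rule described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `FileInfo`, `SortSelectionSketch`, and the size-range check are hypothetical stand-ins, and the real BinPackStrategy selection may apply further criteria (the inherited DELETE_FILE_THRESHOLD constant suggests as much).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a data file; the real API works on FileScanTask.
final class FileInfo {
    final String path;
    final long sizeBytes;

    FileInfo(String path, long sizeBytes) {
        this.path = path;
        this.sizeBytes = sizeBytes;
    }
}

final class SortSelectionSketch {
    // Assumed size-based rule standing in for BinPackStrategy's selection:
    // a file is a candidate when its size falls outside [minBytes, maxBytes].
    static boolean outsideSizeRange(FileInfo file, long minBytes, long maxBytes) {
        return file.sizeBytes < minBytes || file.sizeBytes > maxBytes;
    }

    // With rewriteAll set, every file is selected regardless of size;
    // otherwise only the files the size rule picks are returned.
    static List<FileInfo> selectFilesToRewrite(
            List<FileInfo> files, boolean rewriteAll, long minBytes, long maxBytes) {
        List<FileInfo> selected = new ArrayList<>();
        for (FileInfo file : files) {
            if (rewriteAll || outsideSizeRange(file, minBytes, maxBytes)) {
                selected.add(file);
            }
        }
        return selected;
    }
}
```

Keeping the flag as a short-circuit in front of the size rule mirrors the description: enabling it bypasses selection entirely, while disabling it falls back to the parent strategy's choice.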
Modifier and Type | Field and Description |
---|---|
static java.lang.String | REWRITE_ALL: Rewrites all files, regardless of their size. |
static boolean | REWRITE_ALL_DEFAULT |
Fields inherited from class BinPackStrategy: DELETE_FILE_THRESHOLD, DELETE_FILE_THRESHOLD_DEFAULT, MAX_FILE_SIZE_BYTES, MAX_FILE_SIZE_DEFAULT_RATIO, MIN_FILE_SIZE_BYTES, MIN_FILE_SIZE_DEFAULT_RATIO, MIN_INPUT_FILES, MIN_INPUT_FILES_DEFAULT
Constructor and Description |
---|
SortStrategy() |
Modifier and Type | Method and Description |
---|---|
java.lang.String | name(): Returns the name of this rewrite strategy. |
RewriteStrategy | options(java.util.Map<java.lang.String,java.lang.String> options): Sets options to be used with this strategy. |
java.lang.Iterable<java.util.List<FileScanTask>> | planFileGroups(java.lang.Iterable<FileScanTask> dataFiles): Groups file scans into lists which will be processed in a single executable unit. |
java.lang.Iterable<FileScanTask> | selectFilesToRewrite(java.lang.Iterable<FileScanTask> dataFiles): Selects files which this strategy believes are valid targets to be rewritten. |
protected SortOrder | sortOrder() |
SortStrategy | sortOrder(SortOrder order): Sets the sort order to be used in this strategy when rewriting files. |
protected void | validateOptions() |
java.util.Set<java.lang.String> | validOptions(): Returns a set of options which this rewrite strategy can use. |
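Methods inherited from class BinPackStrategy: inputFileSize, maxGroupSize, numOutputFiles, splitSize, targetFileSize, writeMaxFileSize
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface RewriteStrategy: rewriteFiles, table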
public static final java.lang.String REWRITE_ALL
public static final boolean REWRITE_ALL_DEFAULT
public SortStrategy sortOrder(SortOrder order)
Sets the sort order to be used in this strategy when rewriting files.
Parameters: order - the order to use
protected SortOrder sortOrder()
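public java.lang.String name()
Description copied from interface: RewriteStrategy
Returns the name of this rewrite strategy.
Specified by: name in interface RewriteStrategy
Overrides: name in class BinPackStrategy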
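public java.util.Set<java.lang.String> validOptions()
Description copied from interface: RewriteStrategy
Returns a set of options which this rewrite strategy can use.
Specified by: validOptions in interface RewriteStrategy
Overrides: validOptions in class BinPackStrategy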
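public RewriteStrategy options(java.util.Map<java.lang.String,java.lang.String> options)
Description copied from interface: RewriteStrategy
Sets options to be used with this strategy.
Specified by: options in interface RewriteStrategy
Overrides: options in class BinPackStrategy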
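As an illustration of how options(...) and validOptions() might work together, here is a minimal sketch that reads the REWRITE_ALL flag from a string map. Everything in it is an assumption made for the example: the class `SortOptionsSketch`, the key string `"rewrite-all"` (the page above names only the constant, not its value), and the choice to reject unknown keys inside options(...) itself.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

final class SortOptionsSketch {
    // Assumed key string; the reference above only names the constant.
    static final String REWRITE_ALL = "rewrite-all";
    static final boolean REWRITE_ALL_DEFAULT = false;

    private boolean rewriteAll = REWRITE_ALL_DEFAULT;

    // The set of option keys this sketch understands.
    Set<String> validOptions() {
        return Collections.singleton(REWRITE_ALL);
    }

    // Reads the flag from the map, falling back to the default when absent.
    // Rejecting unknown keys here is a choice of this sketch.
    SortOptionsSketch options(Map<String, String> options) {
        for (String key : options.keySet()) {
            if (!validOptions().contains(key)) {
                throw new IllegalArgumentException("Unknown option: " + key);
            }
        }
        String raw = options.get(REWRITE_ALL);
        rewriteAll = raw == null ? REWRITE_ALL_DEFAULT : Boolean.parseBoolean(raw);
        return this;
    }

    boolean rewriteAll() {
        return rewriteAll;
    }
}
```

Returning `this` from options(...) keeps the fluent style suggested by the signatures above, where both options(...) and sortOrder(...) return a strategy.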
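public java.lang.Iterable<FileScanTask> selectFilesToRewrite(java.lang.Iterable<FileScanTask> dataFiles)
Description copied from interface: RewriteStrategy
Selects files which this strategy believes are valid targets to be rewritten.
Specified by: selectFilesToRewrite in interface RewriteStrategy
Overrides: selectFilesToRewrite in class BinPackStrategy
Parameters: dataFiles - iterable of FileScanTasks for files in a given partition
public java.lang.Iterable<java.util.List<FileScanTask>> planFileGroups(java.lang.Iterable<FileScanTask> dataFiles)
Description copied from interface: RewriteStrategy
Groups file scans into lists which will be processed in a single executable unit.
Specified by: planFileGroups in interface RewriteStrategy
Overrides: planFileGroups in class BinPackStrategy
Parameters: dataFiles - iterable of FileScanTasks to be rewritten
protected void validateOptions()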
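To make the grouping idea behind planFileGroups concrete, the following sketch packs files into groups that are each processed as one unit. It is a simple greedy packer written for illustration only: `FileGroupPlanSketch` is hypothetical, files are reduced to their sizes in bytes, and the size limit parameter is an assumption loosely inspired by the inherited maxGroupSize method. The packing the library actually performs may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class FileGroupPlanSketch {
    // Greedily packs file sizes, in input order, into groups whose total
    // size stays at or below maxGroupSizeBytes. A single file larger than
    // the limit gets a group of its own.
    static List<List<Long>> planFileGroups(List<Long> fileSizes, long maxGroupSizeBytes) {
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0L;
        for (long size : fileSizes) {
            // Close the current group if adding this file would exceed the limit.
            if (!current.isEmpty() && currentBytes + size > maxGroupSizeBytes) {
                groups.add(current);
                current = new ArrayList<>();
                currentBytes = 0L;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }
}
```

For example, sizes 40, 50, 30, 200, 10 with a limit of 100 yield four groups: (40, 50), (30), (200), and (10).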