Package org.apache.iceberg.spark.actions
Class RewritePositionDeleteFilesSparkAction
java.lang.Object
org.apache.iceberg.spark.actions.RewritePositionDeleteFilesSparkAction
- All Implemented Interfaces:
- Action<RewritePositionDeleteFiles,,- RewritePositionDeleteFiles.Result> - RewritePositionDeleteFiles,- SnapshotUpdate<RewritePositionDeleteFiles,- RewritePositionDeleteFiles.Result> 
public class RewritePositionDeleteFilesSparkAction
extends Object
implements RewritePositionDeleteFiles
Spark implementation of 
RewritePositionDeleteFiles.- 
Nested Class SummaryNested classes/interfaces inherited from interface org.apache.iceberg.actions.RewritePositionDeleteFilesRewritePositionDeleteFiles.FileGroupInfo, RewritePositionDeleteFiles.FileGroupRewriteResult, RewritePositionDeleteFiles.Result
- 
Field SummaryFieldsModifier and TypeFieldDescriptionprotected static final org.apache.iceberg.relocated.com.google.common.base.Joinerprotected static final org.apache.iceberg.relocated.com.google.common.base.Splitterprotected static final Stringprotected static final Stringprotected static final Stringprotected static final Stringprotected static final Stringprotected static final StringFields inherited from interface org.apache.iceberg.actions.RewritePositionDeleteFilesMAX_CONCURRENT_FILE_GROUP_REWRITES, MAX_CONCURRENT_FILE_GROUP_REWRITES_DEFAULT, PARTIAL_PROGRESS_ENABLED, PARTIAL_PROGRESS_ENABLED_DEFAULT, PARTIAL_PROGRESS_MAX_COMMITS, PARTIAL_PROGRESS_MAX_COMMITS_DEFAULT, REWRITE_JOB_ORDER, REWRITE_JOB_ORDER_DEFAULT
- 
Method SummaryModifier and TypeMethodDescriptionprotected org.apache.spark.sql.Dataset<FileInfo> protected voidcommit(SnapshotUpdate<?> update) protected org.apache.spark.sql.Dataset<FileInfo> contentFileDS(Table table) protected org.apache.spark.sql.Dataset<FileInfo> contentFileDS(Table table, Set<Long> snapshotIds) protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummarydeleteFiles(ExecutorService executorService, Consumer<String> deleteFunc, Iterator<FileInfo> files) Deletes files and keeps track of how many files were removed for each file type.protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummarydeleteFiles(SupportsBulkOperations io, Iterator<FileInfo> files) execute()Executes this action.filter(Expression expression) A filter for finding deletes to rewrite.protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table, MetadataTableType type) protected org.apache.spark.sql.Dataset<FileInfo> manifestDS(Table table) protected org.apache.spark.sql.Dataset<FileInfo> manifestDS(Table table, Set<Long> snapshotIds) protected org.apache.spark.sql.Dataset<FileInfo> manifestListDS(Table table) protected org.apache.spark.sql.Dataset<FileInfo> manifestListDS(Table table, Set<Long> snapshotIds) protected JobGroupInfonewJobGroupInfo(String groupId, String desc) protected TablenewStaticTable(TableMetadata metadata, FileIO io) options()protected org.apache.spark.sql.Dataset<FileInfo> otherMetadataFileDS(Table table) protected RewritePositionDeleteFilesSparkActionself()snapshotProperty(String property, String value) protected org.apache.spark.sql.SparkSessionspark()protected org.apache.spark.api.java.JavaSparkContextprotected org.apache.spark.sql.Dataset<FileInfo> statisticsFileDS(Table table, Set<Long> snapshotIds) protected <T> TwithJobGroupInfo(JobGroupInfo info, Supplier<T> supplier) Methods inherited from class java.lang.Objectclone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitMethods inherited from interface org.apache.iceberg.actions.SnapshotUpdatesnapshotProperty
- 
Field Details- 
MANIFEST- See Also:
 
- 
MANIFEST_LIST- See Also:
 
- 
STATISTICS_FILES- See Also:
 
- 
OTHERS- See Also:
 
- 
FILE_PATH- See Also:
 
- 
LAST_MODIFIED- See Also:
 
- 
COMMA_SPLITTERprotected static final org.apache.iceberg.relocated.com.google.common.base.Splitter COMMA_SPLITTER
- 
COMMA_JOINERprotected static final org.apache.iceberg.relocated.com.google.common.base.Joiner COMMA_JOINER
 
- 
- 
Method Details- 
self
- 
filterDescription copied from interface:RewritePositionDeleteFilesA filter for finding deletes to rewrite.The filter will be converted to a partition filter with an inclusive projection. Any file that may contain rows matching this filter will be used by the action. The matching delete files will be rewritten. - Specified by:
- filterin interface- RewritePositionDeleteFiles
- Parameters:
- expression- An iceberg expression used to find deletes.
- Returns:
- this for method chaining
 
- 
executeDescription copied from interface:ActionExecutes this action.- Specified by:
- executein interface- Action<RewritePositionDeleteFiles,- RewritePositionDeleteFiles.Result> 
- Returns:
- the result of this action
 
- 
snapshotProperty
- 
commit
- 
commitSummary
- 
sparkprotected org.apache.spark.sql.SparkSession spark()
- 
sparkContextprotected org.apache.spark.api.java.JavaSparkContext sparkContext()
- 
option
- 
options
- 
options
- 
withJobGroupInfo
- 
newJobGroupInfo
- 
newStaticTable
- 
contentFileDS
- 
contentFileDS
- 
manifestDS
- 
manifestDS
- 
manifestListDS
- 
manifestListDS
- 
statisticsFileDS
- 
otherMetadataFileDS
- 
allReachableOtherMetadataFileDS
- 
loadMetadataTableprotected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table, MetadataTableType type) 
- 
deleteFilesprotected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary deleteFiles(ExecutorService executorService, Consumer<String> deleteFunc, Iterator<FileInfo> files) Deletes files and keeps track of how many files were removed for each file type.- Parameters:
- executorService- an executor service to use for parallel deletes
- deleteFunc- a delete func
- files- an iterator of Spark rows of the structure (path: String, type: String)
- Returns:
- stats on which files were deleted
 
- 
deleteFilesprotected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary deleteFiles(SupportsBulkOperations io, Iterator<FileInfo> files) 
 
-