public class BaseRewriteDataFilesSpark3Action
extends java.lang.Object

Nested classes/interfaces inherited from interface RewriteDataFiles:
RewriteDataFiles.FileGroupInfo, RewriteDataFiles.FileGroupRewriteResult, RewriteDataFiles.Result

Fields inherited from interface RewriteDataFiles:
MAX_CONCURRENT_FILE_GROUP_REWRITES, MAX_CONCURRENT_FILE_GROUP_REWRITES_DEFAULT, MAX_FILE_GROUP_SIZE_BYTES, MAX_FILE_GROUP_SIZE_BYTES_DEFAULT, PARTIAL_PROGRESS_ENABLED, PARTIAL_PROGRESS_ENABLED_DEFAULT, PARTIAL_PROGRESS_MAX_COMMITS, PARTIAL_PROGRESS_MAX_COMMITS_DEFAULT, TARGET_FILE_SIZE_BYTES, USE_STARTING_SEQUENCE_NUMBER, USE_STARTING_SEQUENCE_NUMBER_DEFAULT

Constructor Summary

| Modifier | Constructor and Description |
|---|---|
| protected | BaseRewriteDataFilesSpark3Action(org.apache.spark.sql.SparkSession spark, Table table) |

Method Summary
| Modifier and Type | Method and Description |
|---|---|
| RewriteDataFiles | binPack(): Choose BINPACK as a strategy for this rewrite operation. |
| protected BinPackStrategy | binPackStrategy(): The framework specific BinPackStrategy. |
| protected org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; | buildManifestFileDF(Table table) |
| protected org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; | buildManifestListDF(Table table) |
| protected org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; | buildOtherMetadataFileDF(Table table) |
| protected org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; | buildValidDataFileDF(Table table) |
| protected org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; | buildValidMetadataFileDF(Table table) |
| protected void | commit(SnapshotUpdate&lt;?&gt; update) |
| RewriteDataFiles.Result | execute(): Executes this action. |
| RewriteDataFiles | filter(Expression expression): A user provided filter for determining which files will be considered by the rewrite strategy. |
| protected org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; | loadMetadataTable(Table table, MetadataTableType type) |
| protected JobGroupInfo | newJobGroupInfo(java.lang.String groupId, java.lang.String desc) |
| protected Table | newStaticTable(TableMetadata metadata, FileIO io) |
| ThisT | option(java.lang.String name, java.lang.String value): Configures this action with an extra option. |
| protected java.util.Map&lt;java.lang.String,java.lang.String&gt; | options() |
| ThisT | options(java.util.Map&lt;java.lang.String,java.lang.String&gt; newOptions): Configures this action with extra options. |
| protected RewriteDataFiles | self() |
| ThisT | snapshotProperty(java.lang.String property, java.lang.String value): Sets a summary property in the snapshot produced by this action. |
| RewriteDataFiles | sort(): Choose SORT as a strategy for this rewrite operation using the table's sortOrder. |
| RewriteDataFiles | sort(SortOrder sortOrder): Choose SORT as a strategy for this rewrite operation and manually specify the sortOrder to use. |
| protected SortStrategy | sortStrategy(): The framework specific SortStrategy. |
| protected org.apache.spark.sql.SparkSession | spark() |
| protected org.apache.spark.api.java.JavaSparkContext | sparkContext() |
| protected Table | table() |
| protected &lt;T&gt; T | withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier&lt;T&gt; supplier) |
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail

protected BaseRewriteDataFilesSpark3Action(org.apache.spark.sql.SparkSession spark,
                                           Table table)
Method Detail

protected BinPackStrategy binPackStrategy()
The framework specific BinPackStrategy.

protected SortStrategy sortStrategy()
The framework specific SortStrategy.

protected RewriteDataFiles self()

protected Table table()

public RewriteDataFiles binPack()
Choose BINPACK as a strategy for this rewrite operation.
Specified by: binPack in interface RewriteDataFiles
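A minimal sketch of a bin-pack rewrite. It assumes the action is obtained through an entry point such as SparkActions.get(spark).rewriteDataFiles(table), which is not part of this class's API, and that a Table has already been loaded; all rewrite options are left at their defaults.

```java
import org.apache.iceberg.Table;
import org.apache.iceberg.actions.RewriteDataFiles;
import org.apache.iceberg.spark.actions.SparkActions;
import org.apache.spark.sql.SparkSession;

public class BinPackExample {
  public static void compact(SparkSession spark, Table table) {
    // BINPACK combines small files (and splits oversized ones) toward the target file size.
    RewriteDataFiles.Result result =
        SparkActions.get(spark)
            .rewriteDataFiles(table)
            .binPack()   // choose the BINPACK strategy
            .execute();  // plan file groups, rewrite them, and commit the replacement

    System.out.println("Rewrote " + result.rewrittenDataFilesCount() + " data files");
  }
}
```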
public RewriteDataFiles sort(SortOrder sortOrder)
Choose SORT as a strategy for this rewrite operation and manually specify the sortOrder to use.
Specified by: sort in interface RewriteDataFiles
Parameters: sortOrder - user defined sortOrder

public RewriteDataFiles sort()
Choose SORT as a strategy for this rewrite operation using the table's sortOrder.
Specified by: sort in interface RewriteDataFiles
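A sketch of the two sort variants. Here action stands for a RewriteDataFiles instance bound to table (obtained elsewhere, e.g. via SparkActions); the column names in the explicit SortOrder are placeholders.

```java
import org.apache.iceberg.SortOrder;
import org.apache.iceberg.Table;
import org.apache.iceberg.actions.RewriteDataFiles;

public class SortRewriteExample {
  public static void rewriteSorted(RewriteDataFiles action, Table table) {
    // Variant 1: use the table's own sort order (the table must define one).
    // action.sort().execute();

    // Variant 2: supply an explicit sort order for this rewrite only.
    SortOrder byDateAndDevice = SortOrder.builderFor(table.schema())
        .asc("event_date")   // placeholder column names, for illustration only
        .asc("device_id")
        .build();
    action.sort(byDateAndDevice).execute();
  }
}
```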
public RewriteDataFiles filter(Expression expression)
A user provided filter for determining which files will be considered by the rewrite strategy.
Specified by: filter in interface RewriteDataFiles
Parameters: expression - an Iceberg expression used to determine which files will be considered for rewriting
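A sketch of restricting the rewrite with a filter, assuming a partition column named event_date (illustrative only); files that cannot match the expression are never handed to the strategy.

```java
import org.apache.iceberg.actions.RewriteDataFiles;
import org.apache.iceberg.expressions.Expressions;

public class FilteredRewriteExample {
  public static void compactOneDay(RewriteDataFiles action) {
    // Only files whose partition values or metrics could contain matching rows
    // are considered by the rewrite strategy; everything else is skipped.
    action
        .filter(Expressions.equal("event_date", "2021-06-01"))  // assumed partition column
        .binPack()
        .execute();
  }
}
```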
public RewriteDataFiles.Result execute()
Executes this action.
Specified by: execute in interface Action&lt;RewriteDataFiles,RewriteDataFiles.Result&gt;

public ThisT snapshotProperty(java.lang.String property,
                              java.lang.String value)
Sets a summary property in the snapshot produced by this action.
Specified by: snapshotProperty in interface SnapshotUpdate&lt;ThisT,R&gt;
Parameters: property - a snapshot property name
            value - a snapshot property value
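A sketch combining execute() with snapshotProperty(), assuming snapshotProperty is reachable through the RewriteDataFiles interface via SnapshotUpdate (as the Specified by entry above indicates) and that Result offers the aggregated rewrittenDataFilesCount()/addedDataFilesCount() helpers.

```java
import org.apache.iceberg.actions.RewriteDataFiles;

public class RewriteResultExample {
  public static void compactAndTag(RewriteDataFiles action) {
    RewriteDataFiles.Result result = action
        // Stamp the commit(s) produced by this action with a custom summary property.
        .snapshotProperty("compaction-trigger", "nightly-maintenance")
        .binPack()
        .execute();

    // Per-file-group details are available via result.rewriteResults().
    System.out.printf("rewrote %d files, added %d files%n",
        result.rewrittenDataFilesCount(), result.addedDataFilesCount());
  }
}
```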
protected void commit(SnapshotUpdate&lt;?&gt; update)
protected org.apache.spark.sql.SparkSession spark()
protected org.apache.spark.api.java.JavaSparkContext sparkContext()
public ThisT option(java.lang.String name,
java.lang.String value)
Configures this action with an extra option. Certain actions allow users to control internal details of their execution via options.
public ThisT options(java.util.Map<java.lang.String,java.lang.String> newOptions)
Configures this action with extra options. Certain actions allow users to control internal details of their execution via options.
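A sketch of tuning a rewrite through option()/options(). The keys are the RewriteDataFiles constants listed at the top of this page; the values are illustrative, and all options must be set before execute().

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.iceberg.actions.RewriteDataFiles;

public class RewriteOptionsExample {
  public static void compactWithOptions(RewriteDataFiles action) {
    Map<String, String> opts = new HashMap<>();
    // Commit finished file groups as they complete instead of in one final commit.
    opts.put(RewriteDataFiles.PARTIAL_PROGRESS_ENABLED, "true");
    opts.put(RewriteDataFiles.PARTIAL_PROGRESS_MAX_COMMITS, "10");
    // Rewrite up to 4 file groups concurrently.
    opts.put(RewriteDataFiles.MAX_CONCURRENT_FILE_GROUP_REWRITES, "4");

    action
        .options(opts)
        // Individual options can also be set one at a time; the target size is in bytes.
        .option(RewriteDataFiles.TARGET_FILE_SIZE_BYTES, String.valueOf(512L * 1024 * 1024))
        .binPack()
        .execute();
  }
}
```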
protected java.util.Map<java.lang.String,java.lang.String> options()
protected <T> T withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier)
protected JobGroupInfo newJobGroupInfo(java.lang.String groupId, java.lang.String desc)
protected Table newStaticTable(TableMetadata metadata, FileIO io)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidDataFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildManifestListDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildOtherMetadataFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildValidMetadataFileDF(Table table)
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table, MetadataTableType type)