Package org.apache.iceberg.spark
Class Spark3Util
- java.lang.Object
  - org.apache.iceberg.spark.Spark3Util

public class Spark3Util extends java.lang.Object
Nested Class Summary

Nested Classes:
- static class Spark3Util.CatalogAndIdentifier
  This mimics a class inside of Spark which is private inside of LookupCatalog.
- static class Spark3Util.DescribeSchemaVisitor
Method Summary

All Methods, Static Methods, Concrete Methods:

- static UpdateProperties applyPropertyChanges(UpdateProperties pendingUpdate, java.util.List<org.apache.spark.sql.connector.catalog.TableChange> changes)
  Applies a list of Spark table changes to an UpdateProperties operation.
- static UpdateSchema applySchemaChanges(UpdateSchema pendingUpdate, java.util.List<org.apache.spark.sql.connector.catalog.TableChange> changes)
  Applies a list of Spark table changes to an UpdateSchema operation.
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(java.lang.String description, org.apache.spark.sql.SparkSession spark, java.lang.String name)
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(java.lang.String description, org.apache.spark.sql.SparkSession spark, java.lang.String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.lang.String name)
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.lang.String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.util.List<java.lang.String> nameParts)
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.util.List<java.lang.String> nameParts, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)
  A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply; attempts to find the catalog and identifier that a multipart identifier represents.
- static java.lang.String describe(Expression expr)
- static java.lang.String describe(Schema schema)
- static java.lang.String describe(SortOrder order)
- static java.lang.String describe(Type type)
- static boolean extensionsEnabled(org.apache.spark.sql.SparkSession spark)
- static java.util.List<SparkTableUtil.SparkPartition> getPartitions(org.apache.spark.sql.SparkSession spark, org.apache.hadoop.fs.Path rootPath, java.lang.String format, java.util.Map<java.lang.String,java.lang.String> partitionFilter)
  Use Spark to list all partitions in the table.
- static TableIdentifier identifierToTableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
- static Catalog loadIcebergCatalog(org.apache.spark.sql.SparkSession spark, java.lang.String catalogName)
  Returns the underlying Iceberg Catalog object represented by a Spark Catalog.
- static Table loadIcebergTable(org.apache.spark.sql.SparkSession spark, java.lang.String name)
  Returns an Iceberg Table by its name from a Spark V2 Catalog.
- static java.lang.String quotedFullIdentifier(java.lang.String catalogName, org.apache.spark.sql.connector.catalog.Identifier identifier)
- static java.util.Map<java.lang.String,java.lang.String> rebuildCreateProperties(java.util.Map<java.lang.String,java.lang.String> createProperties)
- static org.apache.spark.sql.util.CaseInsensitiveStringMap setOption(java.lang.String key, java.lang.String value, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
- static Table toIcebergTable(org.apache.spark.sql.connector.catalog.Table table)
- static Term toIcebergTerm(org.apache.spark.sql.connector.expressions.Expression expr)
- static org.apache.spark.sql.connector.expressions.NamedReference toNamedReference(java.lang.String name)
- static PartitionSpec toPartitionSpec(Schema schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning)
  Converts Spark transforms into a PartitionSpec.
- static org.apache.spark.sql.connector.expressions.Transform[] toTransforms(PartitionSpec spec)
  Converts a PartitionSpec to Spark transforms.
- static org.apache.spark.sql.catalyst.TableIdentifier toV1TableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
Method Detail
setOption

public static org.apache.spark.sql.util.CaseInsensitiveStringMap setOption(java.lang.String key, java.lang.String value, org.apache.spark.sql.util.CaseInsensitiveStringMap options)

rebuildCreateProperties

public static java.util.Map<java.lang.String,java.lang.String> rebuildCreateProperties(java.util.Map<java.lang.String,java.lang.String> createProperties)
applyPropertyChanges

public static UpdateProperties applyPropertyChanges(UpdateProperties pendingUpdate, java.util.List<org.apache.spark.sql.connector.catalog.TableChange> changes)

Applies a list of Spark table changes to an UpdateProperties operation.

Parameters:
- pendingUpdate - an uncommitted UpdateProperties operation to configure
- changes - a list of Spark table changes
Returns:
- the UpdateProperties operation configured with the changes
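Example (a minimal sketch, assuming an already-loaded Iceberg Table; the property keys and class name are placeholders, not part of the Iceberg API):

    import java.util.Arrays;
    import java.util.List;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.spark.sql.connector.catalog.TableChange;

    public class ApplyPropertyChangesExample {
      // Translates Spark property changes into an Iceberg property update and commits it.
      static void updateProperties(Table table) {
        List<TableChange> changes = Arrays.asList(
            TableChange.setProperty("write.format.default", "parquet"),  // placeholder property
            TableChange.removeProperty("obsolete-key"));                 // placeholder property
        Spark3Util.applyPropertyChanges(table.updateProperties(), changes).commit();
      }
    }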
applySchemaChanges

public static UpdateSchema applySchemaChanges(UpdateSchema pendingUpdate, java.util.List<org.apache.spark.sql.connector.catalog.TableChange> changes)

Applies a list of Spark table changes to an UpdateSchema operation.

Parameters:
- pendingUpdate - an uncommitted UpdateSchema operation to configure
- changes - a list of Spark table changes
Returns:
- the UpdateSchema operation configured with the changes
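Example (a minimal sketch, assuming an already-loaded Iceberg Table; the column names are placeholders):

    import java.util.Arrays;
    import java.util.List;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.spark.sql.connector.catalog.TableChange;
    import org.apache.spark.sql.types.DataTypes;

    public class ApplySchemaChangesExample {
      // Translates Spark schema changes into an Iceberg schema update and commits it.
      static void evolveSchema(Table table) {
        List<TableChange> changes = Arrays.asList(
            TableChange.addColumn(new String[] {"note"}, DataTypes.StringType),  // placeholder column
            TableChange.renameColumn(new String[] {"ts"}, "event_ts"));          // placeholder columns
        Spark3Util.applySchemaChanges(table.updateSchema(), changes).commit();
      }
    }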
toIcebergTable

public static Table toIcebergTable(org.apache.spark.sql.connector.catalog.Table table)
toTransforms

public static org.apache.spark.sql.connector.expressions.Transform[] toTransforms(PartitionSpec spec)

Converts a PartitionSpec to Spark transforms.

Parameters:
- spec - a PartitionSpec
Returns:
- an array of Transforms
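Example (a minimal sketch; the schema and partition fields are placeholders):

    import org.apache.iceberg.PartitionSpec;
    import org.apache.iceberg.Schema;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.iceberg.types.Types;
    import org.apache.spark.sql.connector.expressions.Transform;

    public class ToTransformsExample {
      public static void main(String[] args) {
        Schema schema = new Schema(
            Types.NestedField.required(1, "id", Types.LongType.get()),
            Types.NestedField.required(2, "ts", Types.TimestampType.withZone()));
        PartitionSpec spec = PartitionSpec.builderFor(schema)
            .day("ts")
            .bucket("id", 16)
            .build();
        // Each Iceberg partition field maps to one Spark Transform, e.g. days(ts) and bucket(16, id).
        Transform[] transforms = Spark3Util.toTransforms(spec);
        for (Transform transform : transforms) {
          System.out.println(transform.describe());
        }
      }
    }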
toNamedReference

public static org.apache.spark.sql.connector.expressions.NamedReference toNamedReference(java.lang.String name)

toIcebergTerm

public static Term toIcebergTerm(org.apache.spark.sql.connector.expressions.Expression expr)
toPartitionSpec

public static PartitionSpec toPartitionSpec(Schema schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning)

Converts Spark transforms into a PartitionSpec.

Parameters:
- schema - the table schema
- partitioning - Spark Transforms
Returns:
- a PartitionSpec
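Example (a minimal sketch; the schema and transform arguments are placeholders):

    import org.apache.iceberg.PartitionSpec;
    import org.apache.iceberg.Schema;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.iceberg.types.Types;
    import org.apache.spark.sql.connector.expressions.Expressions;
    import org.apache.spark.sql.connector.expressions.Transform;

    public class ToPartitionSpecExample {
      public static void main(String[] args) {
        Schema schema = new Schema(
            Types.NestedField.required(1, "id", Types.LongType.get()),
            Types.NestedField.required(2, "ts", Types.TimestampType.withZone()));
        Transform[] partitioning = new Transform[] {
            Expressions.days("ts"),
            Expressions.bucket(16, "id")
        };
        // Resolves each Spark transform against the schema and builds the equivalent spec.
        PartitionSpec spec = Spark3Util.toPartitionSpec(schema, partitioning);
        System.out.println(spec);
      }
    }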
describe

public static java.lang.String describe(Expression expr)

describe

public static java.lang.String describe(Schema schema)

describe

public static java.lang.String describe(Type type)

describe

public static java.lang.String describe(SortOrder order)

extensionsEnabled

public static boolean extensionsEnabled(org.apache.spark.sql.SparkSession spark)
loadIcebergTable

public static Table loadIcebergTable(org.apache.spark.sql.SparkSession spark, java.lang.String name) throws org.apache.spark.sql.catalyst.parser.ParseException, org.apache.spark.sql.catalyst.analysis.NoSuchTableException

Returns an Iceberg Table by its name from a Spark V2 Catalog. If caching is enabled in SparkCatalog, the table's TableOperations may be stale; refresh the table to get the latest state.

Parameters:
- spark - SparkSession used for looking up catalog references and tables
- name - the multipart identifier of the Iceberg table
Returns:
- an Iceberg table
Throws:
- org.apache.spark.sql.catalyst.parser.ParseException
- org.apache.spark.sql.catalyst.analysis.NoSuchTableException
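Example (a minimal sketch, assuming a Spark session with an Iceberg catalog configured; the catalog and table names are placeholders):

    import org.apache.iceberg.Table;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.spark.sql.SparkSession;

    public class LoadIcebergTableExample {
      public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
            .appName("load-iceberg-table")
            .master("local[*]")  // local mode, just for the example
            .getOrCreate();
        // "my_catalog.db.events" is a placeholder multipart identifier.
        Table table = Spark3Util.loadIcebergTable(spark, "my_catalog.db.events");
        // If caching is enabled in SparkCatalog, refresh to avoid stale metadata.
        table.refresh();
        System.out.println(table.currentSnapshot());
      }
    }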
loadIcebergCatalog

public static Catalog loadIcebergCatalog(org.apache.spark.sql.SparkSession spark, java.lang.String catalogName)

Returns the underlying Iceberg Catalog object represented by a Spark Catalog.

Parameters:
- spark - SparkSession used for looking up catalog references
- catalogName - the name of the Spark Catalog being referenced
Returns:
- the Iceberg catalog class being wrapped by the Spark Catalog
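Example (a minimal sketch; "my_catalog" and the table identifier are placeholders for a Spark catalog backed by Iceberg's SparkCatalog):

    import org.apache.iceberg.catalog.Catalog;
    import org.apache.iceberg.catalog.TableIdentifier;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.spark.sql.SparkSession;

    public class LoadIcebergCatalogExample {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("load-iceberg-catalog")
            .master("local[*]")  // local mode, just for the example
            .getOrCreate();
        // Unwraps the Iceberg Catalog behind the Spark catalog named "my_catalog".
        Catalog catalog = Spark3Util.loadIcebergCatalog(spark, "my_catalog");
        // The Iceberg Catalog API can then be used directly.
        System.out.println(catalog.tableExists(TableIdentifier.of("db", "events")));
      }
    }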
catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.lang.String name) throws org.apache.spark.sql.catalyst.parser.ParseException

Throws:
- org.apache.spark.sql.catalyst.parser.ParseException

catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.lang.String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog) throws org.apache.spark.sql.catalyst.parser.ParseException

Throws:
- org.apache.spark.sql.catalyst.parser.ParseException

catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(java.lang.String description, org.apache.spark.sql.SparkSession spark, java.lang.String name)

catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(java.lang.String description, org.apache.spark.sql.SparkSession spark, java.lang.String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)

catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.util.List<java.lang.String> nameParts)
catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.util.List<java.lang.String> nameParts, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)

A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply. Attempts to find the catalog and identifier that a multipart identifier represents.

Parameters:
- spark - Spark session to use for resolution
- nameParts - multipart identifier representing a table
- defaultCatalog - catalog to use if none is specified
Returns:
- the CatalogPlugin and Identifier for the table
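Example (a minimal sketch; the name parts are placeholders, the current session catalog is assumed as the default, and the catalog()/identifier() accessors on CatalogAndIdentifier are assumed to be available):

    import java.util.Arrays;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.connector.catalog.CatalogPlugin;

    public class CatalogAndIdentifierExample {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("catalog-and-identifier")
            .master("local[*]")  // local mode, just for the example
            .getOrCreate();
        // Fall back to the current catalog when the name has no catalog part.
        CatalogPlugin defaultCatalog = spark.sessionState().catalogManager().currentCatalog();
        Spark3Util.CatalogAndIdentifier resolved = Spark3Util.catalogAndIdentifier(
            spark, Arrays.asList("my_catalog", "db", "events"), defaultCatalog);
        System.out.println(resolved.catalog().name());
        System.out.println(resolved.identifier());
      }
    }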
identifierToTableIdentifier

public static TableIdentifier identifierToTableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)

quotedFullIdentifier

public static java.lang.String quotedFullIdentifier(java.lang.String catalogName, org.apache.spark.sql.connector.catalog.Identifier identifier)
getPartitions

public static java.util.List<SparkTableUtil.SparkPartition> getPartitions(org.apache.spark.sql.SparkSession spark, org.apache.hadoop.fs.Path rootPath, java.lang.String format, java.util.Map<java.lang.String,java.lang.String> partitionFilter)

Use Spark to list all partitions in the table.

Parameters:
- spark - a Spark session
- rootPath - the table's root path
- format - the file format of the table
- partitionFilter - a map of partition column names to values used to filter the listed partitions
Returns:
- all of the table's partitions
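Example (a minimal sketch; the warehouse path, format, and filter values are placeholders):

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import org.apache.hadoop.fs.Path;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.iceberg.spark.SparkTableUtil;
    import org.apache.spark.sql.SparkSession;

    public class GetPartitionsExample {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("list-partitions")
            .master("local[*]")  // local mode, just for the example
            .getOrCreate();
        Path rootPath = new Path("hdfs://namenode/warehouse/db/events");  // placeholder location
        Map<String, String> partitionFilter = Collections.singletonMap("dt", "2023-01-01");
        // Lists the partitions under rootPath that match the filter.
        List<SparkTableUtil.SparkPartition> partitions =
            Spark3Util.getPartitions(spark, rootPath, "parquet", partitionFilter);
        partitions.forEach(System.out::println);
      }
    }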
toV1TableIdentifier

public static org.apache.spark.sql.catalyst.TableIdentifier toV1TableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
 