Package org.apache.iceberg.spark
Class Spark3Util
java.lang.Object
org.apache.iceberg.spark.Spark3Util
-
Nested Class Summary
Nested Classes
static class  Spark3Util.CatalogAndIdentifier
    This mimics a class inside of Spark which is private inside of LookupCatalog.
static class
Method Summary
static UpdateProperties applyPropertyChanges(UpdateProperties pendingUpdate, List<org.apache.spark.sql.connector.catalog.TableChange> changes)
    Applies a list of Spark table changes to an UpdateProperties operation.
static UpdateSchema applySchemaChanges(UpdateSchema pendingUpdate, List<org.apache.spark.sql.connector.catalog.TableChange> changes)
    Applies a list of Spark table changes to an UpdateSchema operation.
static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(String description, org.apache.spark.sql.SparkSession spark, String name)
static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(String description, org.apache.spark.sql.SparkSession spark, String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)
static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, String name)
static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)
static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, List<String> nameParts)
static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, List<String> nameParts, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)
    A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply. Attempts to find the catalog and identifier that a multipart identifier represents.
static String describe(List<Expression> exprs)
static String describe(Expression expr)
static String describe(…)
static String describe(…)
static String describe(…)
static boolean extensionsEnabled(org.apache.spark.sql.SparkSession spark)
static List<SparkTableUtil.SparkPartition> getPartitions(org.apache.spark.sql.SparkSession spark, org.apache.hadoop.fs.Path rootPath, String format, Map<String, String> partitionFilter, PartitionSpec partitionSpec)
    Use Spark to list all partitions in the table.
static TableIdentifier identifierToTableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
static Catalog loadIcebergCatalog(org.apache.spark.sql.SparkSession spark, String catalogName)
    Returns the underlying Iceberg Catalog object represented by a Spark Catalog.
static Table loadIcebergTable(org.apache.spark.sql.SparkSession spark, String name)
    Returns an Iceberg Table by its name from a Spark V2 Catalog.
static String quotedFullIdentifier(String catalogName, org.apache.spark.sql.connector.catalog.Identifier identifier)
rebuildCreateProperties(Map<String, String> createProperties)
static org.apache.spark.sql.util.CaseInsensitiveStringMap setOption(…)
static Table toIcebergTable(org.apache.spark.sql.connector.catalog.Table table)
static Term toIcebergTerm(org.apache.spark.sql.connector.expressions.Expression expr)
static org.apache.spark.sql.connector.expressions.NamedReference toNamedReference(String name)
static org.apache.spark.sql.connector.expressions.SortOrder[] toOrdering(SortOrder sortOrder)
static PartitionSpec toPartitionSpec(Schema schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning)
    Converts Spark transforms into a PartitionSpec.
static org.apache.spark.sql.connector.expressions.Transform[] toTransforms(PartitionSpec spec)
    Converts a PartitionSpec to Spark transforms.
static org.apache.spark.sql.connector.expressions.Transform[] toTransforms(Schema schema, List<PartitionField> fields)
static org.apache.spark.sql.catalyst.TableIdentifier toV1TableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
-
Method Details
-
setOption
-
rebuildCreateProperties
-
applyPropertyChanges
public static UpdateProperties applyPropertyChanges(UpdateProperties pendingUpdate, List<org.apache.spark.sql.connector.catalog.TableChange> changes)
Applies a list of Spark table changes to an UpdateProperties operation.
Parameters:
pendingUpdate - an uncommitted UpdateProperties operation to configure
changes - a list of Spark table changes
Returns:
the UpdateProperties operation configured with the changes
-
applySchemaChanges
public static UpdateSchema applySchemaChanges(UpdateSchema pendingUpdate, List<org.apache.spark.sql.connector.catalog.TableChange> changes)
Applies a list of Spark table changes to an UpdateSchema operation.
Parameters:
pendingUpdate - an uncommitted UpdateSchema operation to configure
changes - a list of Spark table changes
Returns:
the UpdateSchema operation configured with the changes
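Both helpers walk the list of Spark table changes and translate each one into a call on the pending Iceberg update. The dispatch pattern can be sketched with simplified stand-in types; the Change, SetProperty, and RemoveProperty types below are illustrative only, not the real org.apache.spark.sql.connector.catalog.TableChange hierarchy:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ApplyChangesSketch {
    // Illustrative stand-ins for Spark's TableChange.SetProperty / RemoveProperty.
    public sealed interface Change permits SetProperty, RemoveProperty {}
    public record SetProperty(String key, String value) implements Change {}
    public record RemoveProperty(String key) implements Change {}

    // Stand-in for an uncommitted UpdateProperties operation, modeled as a map.
    // Each change is dispatched by type onto the pending update; unknown change
    // types are rejected, mirroring how an unsupported TableChange would fail.
    public static Map<String, String> applyPropertyChanges(Map<String, String> pending, List<Change> changes) {
        for (Change change : changes) {
            if (change instanceof SetProperty set) {
                pending.put(set.key(), set.value());
            } else if (change instanceof RemoveProperty remove) {
                pending.remove(remove.key());
            } else {
                throw new UnsupportedOperationException("Cannot apply unknown change: " + change);
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        Map<String, String> pending = new HashMap<>(Map.of("format-version", "1"));
        applyPropertyChanges(pending, List.of(
            new SetProperty("commit.retry.num-retries", "10"),
            new RemoveProperty("format-version")));
        System.out.println(pending); // {commit.retry.num-retries=10}
    }
}
```

The real methods work the same way but mutate the Iceberg UpdateProperties / UpdateSchema operation, which is only applied when commit() is called on it.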
-
toIcebergTable
public static Table toIcebergTable(org.apache.spark.sql.connector.catalog.Table table) -
toOrdering
public static org.apache.spark.sql.connector.expressions.SortOrder[] toOrdering(SortOrder sortOrder) -
toTransforms
public static org.apache.spark.sql.connector.expressions.Transform[] toTransforms(Schema schema, List<PartitionField> fields) -
toTransforms
public static org.apache.spark.sql.connector.expressions.Transform[] toTransforms(PartitionSpec spec)
Converts a PartitionSpec to Spark transforms.
Parameters:
spec - a PartitionSpec
Returns:
an array of Transforms
-
toNamedReference
public static org.apache.spark.sql.connector.expressions.NamedReference toNamedReference(String name) -
toIcebergTerm
public static Term toIcebergTerm(org.apache.spark.sql.connector.expressions.Expression expr) -
toPartitionSpec
public static PartitionSpec toPartitionSpec(Schema schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning)
Converts Spark transforms into a PartitionSpec.
Parameters:
schema - the table schema
partitioning - Spark Transforms
Returns:
a PartitionSpec
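Conceptually, toPartitionSpec walks the Spark transforms and adds one matching Iceberg partition field per transform. The name mapping between the two systems can be sketched at the string level; this is a simplified illustration using an informal textual notation for the resulting fields, not the real method, which dispatches on Transform objects against the table schema:

```java
public class PartitionTransformSketch {
    // Maps a Spark transform name plus source column to an informal textual
    // description of the corresponding Iceberg partition field.
    // Spark uses plural time transforms (years, days, ...) while Iceberg
    // uses singular names (year, day, ...); bucket and truncate carry a
    // width argument; anything else is unsupported.
    public static String toIcebergField(String sparkTransform, String column, int... arg) {
        return switch (sparkTransform) {
            case "identity" -> column;
            case "years", "months", "days", "hours" ->
                sparkTransform.substring(0, sparkTransform.length() - 1) + "(" + column + ")";
            case "bucket" -> "bucket[" + arg[0] + "](" + column + ")";
            case "truncate" -> "truncate[" + arg[0] + "](" + column + ")";
            default -> throw new UnsupportedOperationException(
                "Transform is not supported: " + sparkTransform);
        };
    }

    public static void main(String[] args) {
        System.out.println(toIcebergField("days", "ts"));      // day(ts)
        System.out.println(toIcebergField("bucket", "id", 16)); // bucket[16](id)
    }
}
```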
-
describe
-
describe
-
describe
-
describe
-
describe
-
extensionsEnabled
public static boolean extensionsEnabled(org.apache.spark.sql.SparkSession spark) -
loadIcebergTable
public static Table loadIcebergTable(org.apache.spark.sql.SparkSession spark, String name) throws org.apache.spark.sql.catalyst.parser.ParseException, org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Returns an Iceberg Table by its name from a Spark V2 Catalog. If caching is enabled in SparkCatalog, the table's TableOperations may be stale; refresh the table to get the latest state.
Parameters:
spark - SparkSession used for looking up catalog references and tables
name - The multipart identifier of the Iceberg table
Returns:
an Iceberg table
Throws:
org.apache.spark.sql.catalyst.parser.ParseException
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
loadIcebergCatalog
public static Catalog loadIcebergCatalog(org.apache.spark.sql.SparkSession spark, String catalogName)
Returns the underlying Iceberg Catalog object represented by a Spark Catalog.
Parameters:
spark - SparkSession used for looking up the catalog reference
catalogName - The name of the Spark Catalog being referenced
Returns:
the Iceberg catalog class being wrapped by the Spark Catalog
-
catalogAndIdentifier
public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, String name) throws org.apache.spark.sql.catalyst.parser.ParseException
Throws:
org.apache.spark.sql.catalyst.parser.ParseException
-
catalogAndIdentifier
public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog) throws org.apache.spark.sql.catalyst.parser.ParseException
Throws:
org.apache.spark.sql.catalyst.parser.ParseException
-
catalogAndIdentifier
public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(String description, org.apache.spark.sql.SparkSession spark, String name) -
catalogAndIdentifier
public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(String description, org.apache.spark.sql.SparkSession spark, String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog) -
catalogAndIdentifier
public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, List<String> nameParts) -
catalogAndIdentifier
public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, List<String> nameParts, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)
A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply. Attempts to find the catalog and identifier that a multipart identifier represents.
Parameters:
spark - Spark session to use for resolution
nameParts - Multipart identifier representing a table
defaultCatalog - Catalog to use if none is specified
Returns:
The CatalogPlugin and Identifier for the table
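The resolution rule can be mimicked in a self-contained sketch: if the first name part names a registered catalog, it becomes the catalog and is stripped off; otherwise the default catalog is used and all parts form the identifier. The resolve helper and the nested record below are illustrative stand-ins, not the actual Spark or Iceberg types:

```java
import java.util.List;
import java.util.Set;

public class CatalogAndIdentifierSketch {
    // Stand-in for Spark3Util.CatalogAndIdentifier: a catalog name plus an
    // identifier split into namespace parts and a final table name.
    public record CatalogAndIdentifier(String catalog, List<String> namespace, String name) {}

    // Mimics LookupCatalog.CatalogAndIdentifier.unapply's rule:
    // "prod.db.events" -> catalog "prod", identifier db.events (if "prod" is registered)
    // "db.events"      -> default catalog, identifier db.events
    public static CatalogAndIdentifier resolve(
            Set<String> registeredCatalogs, String defaultCatalog, List<String> nameParts) {
        String catalog = defaultCatalog;
        List<String> rest = nameParts;
        if (nameParts.size() > 1 && registeredCatalogs.contains(nameParts.get(0))) {
            catalog = nameParts.get(0);
            rest = nameParts.subList(1, nameParts.size());
        }
        return new CatalogAndIdentifier(
            catalog, rest.subList(0, rest.size() - 1), rest.get(rest.size() - 1));
    }

    public static void main(String[] args) {
        Set<String> catalogs = Set.of("prod", "spark_catalog");
        System.out.println(resolve(catalogs, "spark_catalog", List.of("prod", "db", "events")));
        System.out.println(resolve(catalogs, "spark_catalog", List.of("db", "events")));
    }
}
```

The real method additionally consults Spark's CatalogManager for registered catalogs, which is why a SparkSession is required.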
-
identifierToTableIdentifier
public static TableIdentifier identifierToTableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier) -
quotedFullIdentifier
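A quoted full identifier joins the catalog name and identifier parts with dots, quoting any part that is not a plain identifier. The sketch below assumes Spark SQL's backtick convention (wrap in backticks, double any embedded backticks); the helper names are illustrative, not the real method:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class QuotedIdentifierSketch {
    // Backtick-quotes a part only when it contains characters beyond
    // letters, digits, and underscores; embedded backticks are doubled.
    public static String quote(String part) {
        if (part.matches("[a-zA-Z_][a-zA-Z0-9_]*")) {
            return part;
        }
        return "`" + part.replace("`", "``") + "`";
    }

    // Joins catalog, namespace parts, and table name into one dotted identifier.
    public static String quotedFullIdentifier(String catalogName, List<String> namespace, String name) {
        return Stream.concat(Stream.of(catalogName),
                Stream.concat(namespace.stream(), Stream.of(name)))
            .map(QuotedIdentifierSketch::quote)
            .collect(Collectors.joining("."));
    }

    public static void main(String[] args) {
        System.out.println(quotedFullIdentifier("prod", List.of("my db"), "events"));
        // prod.`my db`.events
    }
}
```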
public static String quotedFullIdentifier(String catalogName, org.apache.spark.sql.connector.catalog.Identifier identifier) -
getPartitions
public static List<SparkTableUtil.SparkPartition> getPartitions(org.apache.spark.sql.SparkSession spark, org.apache.hadoop.fs.Path rootPath, String format, Map<String, String> partitionFilter, PartitionSpec partitionSpec)
Use Spark to list all partitions in the table.
Parameters:
spark - a Spark session
rootPath - the root path of the table
format - the file format of the table's data files
partitionFilter - a filter on partition values
partitionSpec - the partition spec of the table
Returns:
all of the table's partitions
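Partition discovery over a path-based table boils down to walking Hive-style key=value directories under the root path and keeping those that match the filter. A minimal self-contained sketch of that idea, with illustrative helpers rather than the real Spark-backed implementation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionPathSketch {
    // Parses a relative Hive-style partition path such as "year=2024/month=05"
    // into ordered partition key/value pairs.
    public static Map<String, String> parsePartitionPath(String relativePath) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String segment : relativePath.split("/")) {
            String[] kv = segment.split("=", 2);
            values.put(kv[0], kv[1]);
        }
        return values;
    }

    // A partition matches the filter when it contains every filter entry.
    public static boolean matches(Map<String, String> partition, Map<String, String> filter) {
        return partition.entrySet().containsAll(filter.entrySet());
    }

    public static void main(String[] args) {
        Map<String, String> p = parsePartitionPath("year=2024/month=05");
        System.out.println(p);                                  // {year=2024, month=05}
        System.out.println(matches(p, Map.of("year", "2024"))); // true
    }
}
```

The real method delegates the directory listing to Spark so it can parallelize discovery over large tables, and returns SparkTableUtil.SparkPartition objects carrying the values, path, and format.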
-
toV1TableIdentifier
public static org.apache.spark.sql.catalyst.TableIdentifier toV1TableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
-