Package org.apache.iceberg.spark
Class Spark3Util
- java.lang.Object
  - org.apache.iceberg.spark.Spark3Util

public class Spark3Util extends java.lang.Object
Nested Class Summary

Nested Classes:
- static class Spark3Util.CatalogAndIdentifier
  This mimics a class inside of Spark which is private inside of LookupCatalog.
- static class Spark3Util.DescribeSchemaVisitor
Method Summary

All Methods, Static Methods, Concrete Methods:

- static UpdateProperties applyPropertyChanges(UpdateProperties pendingUpdate, java.util.List<org.apache.spark.sql.connector.catalog.TableChange> changes)
  Applies a list of Spark table changes to an UpdateProperties operation.
- static UpdateSchema applySchemaChanges(UpdateSchema pendingUpdate, java.util.List<org.apache.spark.sql.connector.catalog.TableChange> changes)
  Applies a list of Spark table changes to an UpdateSchema operation.
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(java.lang.String description, org.apache.spark.sql.SparkSession spark, java.lang.String name)
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(java.lang.String description, org.apache.spark.sql.SparkSession spark, java.lang.String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.lang.String name)
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.lang.String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.util.List<java.lang.String> nameParts)
- static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.util.List<java.lang.String> nameParts, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)
  A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply; attempts to find the catalog and identifier that a multipart identifier represents.
- static java.lang.String describe(Expression expr)
- static java.lang.String describe(Schema schema)
- static java.lang.String describe(SortOrder order)
- static java.lang.String describe(Type type)
- static boolean extensionsEnabled(org.apache.spark.sql.SparkSession spark)
- static java.util.List<SparkTableUtil.SparkPartition> getPartitions(org.apache.spark.sql.SparkSession spark, org.apache.hadoop.fs.Path rootPath, java.lang.String format, java.util.Map<java.lang.String,java.lang.String> partitionFilter)
  Use Spark to list all partitions in the table.
- static TableIdentifier identifierToTableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
- static Catalog loadIcebergCatalog(org.apache.spark.sql.SparkSession spark, java.lang.String catalogName)
  Returns the underlying Iceberg Catalog object represented by a Spark Catalog.
- static Table loadIcebergTable(org.apache.spark.sql.SparkSession spark, java.lang.String name)
  Returns an Iceberg Table by its name from a Spark V2 Catalog.
- static java.lang.String quotedFullIdentifier(java.lang.String catalogName, org.apache.spark.sql.connector.catalog.Identifier identifier)
- static java.util.Map<java.lang.String,java.lang.String> rebuildCreateProperties(java.util.Map<java.lang.String,java.lang.String> createProperties)
- static org.apache.spark.sql.util.CaseInsensitiveStringMap setOption(java.lang.String key, java.lang.String value, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
- static Table toIcebergTable(org.apache.spark.sql.connector.catalog.Table table)
- static Term toIcebergTerm(org.apache.spark.sql.connector.expressions.Expression expr)
- static org.apache.spark.sql.connector.expressions.NamedReference toNamedReference(java.lang.String name)
- static PartitionSpec toPartitionSpec(Schema schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning)
  Converts Spark transforms into a PartitionSpec.
- static org.apache.spark.sql.connector.expressions.Transform[] toTransforms(PartitionSpec spec)
  Converts a PartitionSpec to Spark transforms.
- static org.apache.spark.sql.catalyst.TableIdentifier toV1TableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
Method Detail
setOption

public static org.apache.spark.sql.util.CaseInsensitiveStringMap setOption(java.lang.String key, java.lang.String value, org.apache.spark.sql.util.CaseInsensitiveStringMap options)

rebuildCreateProperties

public static java.util.Map<java.lang.String,java.lang.String> rebuildCreateProperties(java.util.Map<java.lang.String,java.lang.String> createProperties)
applyPropertyChanges

public static UpdateProperties applyPropertyChanges(UpdateProperties pendingUpdate, java.util.List<org.apache.spark.sql.connector.catalog.TableChange> changes)

Applies a list of Spark table changes to an UpdateProperties operation.

Parameters:
- pendingUpdate - an uncommitted UpdateProperties operation to configure
- changes - a list of Spark table changes
Returns:
- the UpdateProperties operation configured with the changes
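Example (a minimal sketch, assuming an already-loaded Iceberg Table; the property keys and class name are placeholders, not part of the Iceberg API):

    import java.util.Arrays;
    import java.util.List;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.spark.sql.connector.catalog.TableChange;

    public class ApplyPropertyChangesExample {
      // Translates Spark property changes into an Iceberg property update and commits it.
      static void updateProperties(Table table) {
        List<TableChange> changes = Arrays.asList(
            TableChange.setProperty("write.format.default", "parquet"),  // placeholder property
            TableChange.removeProperty("obsolete-key"));                 // placeholder property
        Spark3Util.applyPropertyChanges(table.updateProperties(), changes).commit();
      }
    }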
applySchemaChanges

public static UpdateSchema applySchemaChanges(UpdateSchema pendingUpdate, java.util.List<org.apache.spark.sql.connector.catalog.TableChange> changes)

Applies a list of Spark table changes to an UpdateSchema operation.

Parameters:
- pendingUpdate - an uncommitted UpdateSchema operation to configure
- changes - a list of Spark table changes
Returns:
- the UpdateSchema operation configured with the changes
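Example (a minimal sketch, assuming an already-loaded Iceberg Table; the column names are placeholders):

    import java.util.Arrays;
    import java.util.List;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.spark.sql.connector.catalog.TableChange;
    import org.apache.spark.sql.types.DataTypes;

    public class ApplySchemaChangesExample {
      // Translates Spark schema changes into an Iceberg schema update and commits it.
      static void evolveSchema(Table table) {
        List<TableChange> changes = Arrays.asList(
            TableChange.addColumn(new String[] {"note"}, DataTypes.StringType),  // placeholder column
            TableChange.renameColumn(new String[] {"ts"}, "event_ts"));          // placeholder columns
        Spark3Util.applySchemaChanges(table.updateSchema(), changes).commit();
      }
    }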
toIcebergTable

public static Table toIcebergTable(org.apache.spark.sql.connector.catalog.Table table)
toTransforms

public static org.apache.spark.sql.connector.expressions.Transform[] toTransforms(PartitionSpec spec)

Converts a PartitionSpec to Spark transforms.

Parameters:
- spec - a PartitionSpec
Returns:
- an array of Transforms
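Example (a minimal sketch; the schema and partition fields are placeholders):

    import org.apache.iceberg.PartitionSpec;
    import org.apache.iceberg.Schema;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.iceberg.types.Types;
    import org.apache.spark.sql.connector.expressions.Transform;

    public class ToTransformsExample {
      public static void main(String[] args) {
        Schema schema = new Schema(
            Types.NestedField.required(1, "id", Types.LongType.get()),
            Types.NestedField.required(2, "ts", Types.TimestampType.withZone()));
        PartitionSpec spec = PartitionSpec.builderFor(schema)
            .day("ts")
            .bucket("id", 16)
            .build();
        // Each Iceberg partition field maps to one Spark Transform, e.g. days(ts) and bucket(16, id).
        Transform[] transforms = Spark3Util.toTransforms(spec);
        for (Transform transform : transforms) {
          System.out.println(transform.describe());
        }
      }
    }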
toNamedReference

public static org.apache.spark.sql.connector.expressions.NamedReference toNamedReference(java.lang.String name)

toIcebergTerm

public static Term toIcebergTerm(org.apache.spark.sql.connector.expressions.Expression expr)
toPartitionSpec

public static PartitionSpec toPartitionSpec(Schema schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning)

Converts Spark transforms into a PartitionSpec.

Parameters:
- schema - the table schema
- partitioning - Spark Transforms
Returns:
- a PartitionSpec
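Example (a minimal sketch; the schema and transform arguments are placeholders):

    import org.apache.iceberg.PartitionSpec;
    import org.apache.iceberg.Schema;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.iceberg.types.Types;
    import org.apache.spark.sql.connector.expressions.Expressions;
    import org.apache.spark.sql.connector.expressions.Transform;

    public class ToPartitionSpecExample {
      public static void main(String[] args) {
        Schema schema = new Schema(
            Types.NestedField.required(1, "id", Types.LongType.get()),
            Types.NestedField.required(2, "ts", Types.TimestampType.withZone()));
        Transform[] partitioning = new Transform[] {
            Expressions.days("ts"),
            Expressions.bucket(16, "id")
        };
        // Resolves each Spark transform against the schema and builds the equivalent spec.
        PartitionSpec spec = Spark3Util.toPartitionSpec(schema, partitioning);
        System.out.println(spec);
      }
    }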
describe

public static java.lang.String describe(Expression expr)

describe

public static java.lang.String describe(Schema schema)

describe

public static java.lang.String describe(Type type)

describe

public static java.lang.String describe(SortOrder order)

extensionsEnabled

public static boolean extensionsEnabled(org.apache.spark.sql.SparkSession spark)
loadIcebergTable

public static Table loadIcebergTable(org.apache.spark.sql.SparkSession spark, java.lang.String name) throws org.apache.spark.sql.catalyst.parser.ParseException, org.apache.spark.sql.catalyst.analysis.NoSuchTableException

Returns an Iceberg Table by its name from a Spark V2 Catalog. If caching is enabled in SparkCatalog, the table's TableOperations may be stale; refresh the table to get the latest state.

Parameters:
- spark - SparkSession used for looking up catalog references and tables
- name - the multipart identifier of the Iceberg table
Returns:
- an Iceberg table
Throws:
- org.apache.spark.sql.catalyst.parser.ParseException
- org.apache.spark.sql.catalyst.analysis.NoSuchTableException
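Example (a minimal sketch, assuming a Spark session with an Iceberg catalog configured; the catalog and table names are placeholders):

    import org.apache.iceberg.Table;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.spark.sql.SparkSession;

    public class LoadIcebergTableExample {
      public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
            .appName("load-iceberg-table")
            .master("local[*]")  // local mode, just for the example
            .getOrCreate();
        // "my_catalog.db.events" is a placeholder multipart identifier.
        Table table = Spark3Util.loadIcebergTable(spark, "my_catalog.db.events");
        // If caching is enabled in SparkCatalog, refresh to avoid stale metadata.
        table.refresh();
        System.out.println(table.currentSnapshot());
      }
    }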
loadIcebergCatalog

public static Catalog loadIcebergCatalog(org.apache.spark.sql.SparkSession spark, java.lang.String catalogName)

Returns the underlying Iceberg Catalog object represented by a Spark Catalog.

Parameters:
- spark - SparkSession used for looking up catalog references
- catalogName - the name of the Spark Catalog being referenced
Returns:
- the Iceberg catalog class being wrapped by the Spark Catalog
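Example (a minimal sketch; "my_catalog" and the table identifier are placeholders for a Spark catalog backed by Iceberg's SparkCatalog):

    import org.apache.iceberg.catalog.Catalog;
    import org.apache.iceberg.catalog.TableIdentifier;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.spark.sql.SparkSession;

    public class LoadIcebergCatalogExample {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("load-iceberg-catalog")
            .master("local[*]")  // local mode, just for the example
            .getOrCreate();
        // Unwraps the Iceberg Catalog behind the Spark catalog named "my_catalog".
        Catalog catalog = Spark3Util.loadIcebergCatalog(spark, "my_catalog");
        // The Iceberg Catalog API can then be used directly.
        System.out.println(catalog.tableExists(TableIdentifier.of("db", "events")));
      }
    }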
catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.lang.String name) throws org.apache.spark.sql.catalyst.parser.ParseException

Throws:
- org.apache.spark.sql.catalyst.parser.ParseException

catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.lang.String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog) throws org.apache.spark.sql.catalyst.parser.ParseException

Throws:
- org.apache.spark.sql.catalyst.parser.ParseException

catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(java.lang.String description, org.apache.spark.sql.SparkSession spark, java.lang.String name)

catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(java.lang.String description, org.apache.spark.sql.SparkSession spark, java.lang.String name, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)

catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.util.List<java.lang.String> nameParts)
catalogAndIdentifier

public static Spark3Util.CatalogAndIdentifier catalogAndIdentifier(org.apache.spark.sql.SparkSession spark, java.util.List<java.lang.String> nameParts, org.apache.spark.sql.connector.catalog.CatalogPlugin defaultCatalog)

A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply. Attempts to find the catalog and identifier that a multipart identifier represents.

Parameters:
- spark - Spark session to use for resolution
- nameParts - multipart identifier representing a table
- defaultCatalog - catalog to use if none is specified
Returns:
- the CatalogPlugin and Identifier for the table
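Example (a minimal sketch; the name parts are placeholders, the current session catalog is assumed as the default, and the catalog()/identifier() accessors on CatalogAndIdentifier are assumed to be available):

    import java.util.Arrays;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.connector.catalog.CatalogPlugin;

    public class CatalogAndIdentifierExample {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("catalog-and-identifier")
            .master("local[*]")  // local mode, just for the example
            .getOrCreate();
        // Fall back to the current catalog when the name has no catalog part.
        CatalogPlugin defaultCatalog = spark.sessionState().catalogManager().currentCatalog();
        Spark3Util.CatalogAndIdentifier resolved = Spark3Util.catalogAndIdentifier(
            spark, Arrays.asList("my_catalog", "db", "events"), defaultCatalog);
        System.out.println(resolved.catalog().name());
        System.out.println(resolved.identifier());
      }
    }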
identifierToTableIdentifier

public static TableIdentifier identifierToTableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)

quotedFullIdentifier

public static java.lang.String quotedFullIdentifier(java.lang.String catalogName, org.apache.spark.sql.connector.catalog.Identifier identifier)
getPartitions

public static java.util.List<SparkTableUtil.SparkPartition> getPartitions(org.apache.spark.sql.SparkSession spark, org.apache.hadoop.fs.Path rootPath, java.lang.String format, java.util.Map<java.lang.String,java.lang.String> partitionFilter)

Use Spark to list all partitions in the table.

Parameters:
- spark - a Spark session
- rootPath - the table's root path
- format - the file format of the table
- partitionFilter - a map of partition column names to values used to filter the listed partitions
Returns:
- all of the table's partitions
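Example (a minimal sketch; the warehouse path, format, and filter values are placeholders):

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import org.apache.hadoop.fs.Path;
    import org.apache.iceberg.spark.Spark3Util;
    import org.apache.iceberg.spark.SparkTableUtil;
    import org.apache.spark.sql.SparkSession;

    public class GetPartitionsExample {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("list-partitions")
            .master("local[*]")  // local mode, just for the example
            .getOrCreate();
        Path rootPath = new Path("hdfs://namenode/warehouse/db/events");  // placeholder location
        Map<String, String> partitionFilter = Collections.singletonMap("dt", "2023-01-01");
        // Lists the partitions under rootPath that match the filter.
        List<SparkTableUtil.SparkPartition> partitions =
            Spark3Util.getPartitions(spark, rootPath, "parquet", partitionFilter);
        partitions.forEach(System.out::println);
      }
    }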
toV1TableIdentifier

public static org.apache.spark.sql.catalyst.TableIdentifier toV1TableIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
 