Class SparkUtil

java.lang.Object
org.apache.iceberg.spark.SparkUtil

public class SparkUtil extends Object
  • Method Details

    • validatePartitionTransforms

      public static void validatePartitionTransforms(PartitionSpec spec)
      Check whether the partition transforms in a spec can be used to write data.
      Parameters:
      spec - a PartitionSpec
      Throws:
      UnsupportedOperationException - if the spec contains unknown partition transforms
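
      For example, a spec built from known transforms passes validation. A minimal sketch, assuming an illustrative schema and field name:

          import org.apache.iceberg.PartitionSpec;
          import org.apache.iceberg.Schema;
          import org.apache.iceberg.spark.SparkUtil;
          import org.apache.iceberg.types.Types;

          Schema schema = new Schema(
              Types.NestedField.required(1, "ts", Types.TimestampType.withZone()));
          PartitionSpec spec = PartitionSpec.builderFor(schema).day("ts").build();

          // Throws UnsupportedOperationException only if the spec contains unknown transforms.
          SparkUtil.validatePartitionTransforms(spec);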
    • catalogAndIdentifier

      public static <C, T> Pair<C,T> catalogAndIdentifier(List<String> nameParts, Function<String,C> catalogProvider, BiFunction<String[],String,T> identifierProvider, C currentCatalog, String[] currentNamespace)
      A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply. Attempts to find the catalog and identifier that a multipart identifier represents.
      Parameters:
      nameParts - Multipart identifier representing a table
      catalogProvider - Function that resolves a catalog from its name
      identifierProvider - Function that builds an identifier from a namespace and a name
      currentCatalog - Catalog to fall back on when nameParts does not name one
      currentNamespace - Namespace to fall back on when nameParts does not include one
      Returns:
      The CatalogPlugin and Identifier for the table
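
      A minimal sketch of a call that resolves against Spark's CatalogManager; the session lookup, catalog name, and table name are illustrative, and Identifier::of matches the BiFunction<String[],String,T> shape:

          import java.util.Arrays;
          import java.util.List;
          import org.apache.iceberg.spark.SparkUtil;
          import org.apache.iceberg.util.Pair;
          import org.apache.spark.sql.SparkSession;
          import org.apache.spark.sql.connector.catalog.CatalogManager;
          import org.apache.spark.sql.connector.catalog.CatalogPlugin;
          import org.apache.spark.sql.connector.catalog.Identifier;

          SparkSession spark = SparkSession.active();
          CatalogManager catalogs = spark.sessionState().catalogManager();

          List<String> nameParts = Arrays.asList("my_catalog", "db", "events");
          Pair<CatalogPlugin, Identifier> resolved = SparkUtil.catalogAndIdentifier(
              nameParts,
              catalogs::catalog,      // resolve a catalog plugin by name
              Identifier::of,         // build an Identifier from a namespace and a name
              catalogs.currentCatalog(),
              catalogs.currentNamespace());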
    • hadoopConfCatalogOverrides

      public static org.apache.hadoop.conf.Configuration hadoopConfCatalogOverrides(org.apache.spark.sql.SparkSession spark, String catalogName)
      Pulls any catalog-specific overrides for the Hadoop conf from the current SparkSession. These can be set via `spark.sql.catalog.$catalogName.hadoop.*`.

      Mirrors the way Hadoop configurations are overridden for a whole Spark session using `spark.hadoop.*`.

      The SparkCatalog allows Hadoop configurations to be overridden per catalog by setting them on the SQLConf. For example, the following adds the property "fs.default.name" with value "hdfs://hanksnamenode:8020" to the catalog's Hadoop configuration:

          SparkSession.builder()
              .config(s"spark.sql.catalog.$catalogName.hadoop.fs.default.name", "hdfs://hanksnamenode:8020")
              .getOrCreate()

      Parameters:
      spark - The current Spark session
      catalogName - Name of the catalog to find overrides for.
      Returns:
      the Hadoop Configuration that should be used for this catalog, with catalog specific overrides applied.
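
      A short usage sketch; the catalog name is illustrative:

          import org.apache.hadoop.conf.Configuration;
          import org.apache.iceberg.spark.SparkUtil;
          import org.apache.spark.sql.SparkSession;

          SparkSession spark = SparkSession.builder()
              .config("spark.sql.catalog.my_catalog.hadoop.fs.default.name", "hdfs://hanksnamenode:8020")
              .getOrCreate();

          // The returned Configuration carries the catalog-specific override.
          Configuration conf = SparkUtil.hadoopConfCatalogOverrides(spark, "my_catalog");
          String fsDefault = conf.get("fs.default.name");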
    • partitionMapToExpression

      public static List<org.apache.spark.sql.catalyst.expressions.Expression> partitionMapToExpression(org.apache.spark.sql.types.StructType schema, Map<String,String> filters)
      Gets a list of Spark filter Expressions from a map of column filters.
      Parameters:
      schema - table schema
      filters - filters in the format of a Map, where each key is a table column name and each value is the value to filter on for that column
      Returns:
      a List of filters as Spark Expressions
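
      For instance, a single-column filter might be built as follows (a sketch; the schema and filter value are illustrative):

          import java.util.List;
          import java.util.Map;
          import org.apache.iceberg.spark.SparkUtil;
          import org.apache.spark.sql.catalyst.expressions.Expression;
          import org.apache.spark.sql.types.DataTypes;
          import org.apache.spark.sql.types.StructType;

          StructType schema = new StructType()
              .add("id", DataTypes.LongType)
              .add("region", DataTypes.StringType);

          // Each map entry is interpreted as column = value.
          Map<String, String> filters = Map.of("region", "us-east-1");
          List<Expression> exprs = SparkUtil.partitionMapToExpression(schema, filters);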
    • toColumnName

      public static String toColumnName(org.apache.spark.sql.connector.expressions.NamedReference ref)
      Returns the column name for the given field reference.
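
      A minimal sketch, assuming Spark's Expressions.column factory to build the reference:

          import org.apache.iceberg.spark.SparkUtil;
          import org.apache.spark.sql.connector.expressions.Expressions;
          import org.apache.spark.sql.connector.expressions.NamedReference;

          NamedReference ref = Expressions.column("location.lat");
          String column = SparkUtil.toColumnName(ref);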
    • caseSensitive

      public static boolean caseSensitive(org.apache.spark.sql.SparkSession spark)
      Returns whether the given Spark session uses case-sensitive analysis (spark.sql.caseSensitive).
    • executorLocations

      public static List<String> executorLocations()
      Returns the locations of the executors in the active Spark application.
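
      Neither of the last two helpers takes Iceberg-specific inputs; a brief sketch of calling both, assuming an active session:

          import java.util.List;
          import org.apache.iceberg.spark.SparkUtil;
          import org.apache.spark.sql.SparkSession;

          SparkSession spark = SparkSession.active();
          boolean caseSensitive = SparkUtil.caseSensitive(spark);
          List<String> locations = SparkUtil.executorLocations();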