Class SparkReadConf


  • public class SparkReadConf
    extends java.lang.Object
    A class for common Iceberg configs for Spark reads.

    If a config is set at multiple levels, the following order of precedence is used (top to bottom):

    1. Read options
    2. Session configuration
    3. Table metadata
    Read options are the most specific level and take precedence over all other configs. If no read option is provided, this class checks the session configuration for an override. If the session configuration has no applicable value either, this class falls back to the table metadata.
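
    The lookup order described above can be sketched in plain Java as follows. This is only an illustration of the three-level precedence, not the actual implementation of this class: ReadConfPrecedenceSketch and resolve are hypothetical names, plain maps stand in for the read options, the Spark session configuration, and the table properties, and the key names in main are examples that should be checked against the Iceberg configuration documentation.

```java
import java.util.Map;

// Hypothetical stand-in for illustration only; this class is not part of Iceberg.
final class ReadConfPrecedenceSketch {

  private ReadConfPrecedenceSketch() {}

  // Returns the first value found, checking read options first,
  // then the session configuration, then the table properties.
  // Returns null if no level defines a value (an assumption of this sketch).
  static String resolve(
      Map<String, String> readOptions, String optionKey,
      Map<String, String> sessionConf, String sessionKey,
      Map<String, String> tableProps, String tableKey) {
    String value = readOptions.get(optionKey);
    if (value != null) {
      return value; // 1. read option wins
    }
    value = sessionConf.get(sessionKey);
    if (value != null) {
      return value; // 2. session configuration override
    }
    return tableProps.get(tableKey); // 3. table metadata
  }

  public static void main(String[] args) {
    // Example key names; assumed here for illustration.
    Map<String, String> readOptions = Map.of("vectorization-enabled", "false");
    Map<String, String> sessionConf = Map.of("spark.sql.iceberg.vectorization.enabled", "true");
    Map<String, String> tableProps = Map.of("read.parquet.vectorization.enabled", "true");

    String resolved = resolve(
        readOptions, "vectorization-enabled",
        sessionConf, "spark.sql.iceberg.vectorization.enabled",
        tableProps, "read.parquet.vectorization.enabled");

    // The read option is present, so it overrides both lower levels.
    System.out.println(resolved);
  }
}
```

    In this sketch each level is consulted only when every higher level has no value, so removing the read option would let the session value apply, and removing both would let the table property apply.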

    Note that this class is NOT meant to be serialized and sent to executors.

    • Constructor Detail

      • SparkReadConf

        public SparkReadConf(org.apache.spark.sql.SparkSession spark,
                             Table table,
                             java.util.Map<java.lang.String,java.lang.String> readOptions)
      • SparkReadConf

        public SparkReadConf(org.apache.spark.sql.SparkSession spark,
                             Table table,
                             java.lang.String branch,
                             java.util.Map<java.lang.String,java.lang.String> readOptions)
    • Method Detail

      • caseSensitive

        public boolean caseSensitive()
      • localityEnabled

        public boolean localityEnabled()
      • snapshotId

        public java.lang.Long snapshotId()
      • asOfTimestamp

        public java.lang.Long asOfTimestamp()
      • startSnapshotId

        public java.lang.Long startSnapshotId()
      • endSnapshotId

        public java.lang.Long endSnapshotId()
      • branch

        public java.lang.String branch()
      • tag

        public java.lang.String tag()
      • scanTaskSetId

        public java.lang.String scanTaskSetId()
      • streamingSkipDeleteSnapshots

        public boolean streamingSkipDeleteSnapshots()
      • streamingSkipOverwriteSnapshots

        public boolean streamingSkipOverwriteSnapshots()
      • parquetVectorizationEnabled

        public boolean parquetVectorizationEnabled()
      • parquetBatchSize

        public int parquetBatchSize()
      • orcVectorizationEnabled

        public boolean orcVectorizationEnabled()
      • orcBatchSize

        public int orcBatchSize()
      • splitSizeOption

        public java.lang.Long splitSizeOption()
      • splitSize

        public long splitSize()
      • splitLookbackOption

        public java.lang.Integer splitLookbackOption()
      • splitLookback

        public int splitLookback()
      • splitOpenFileCostOption

        public java.lang.Long splitOpenFileCostOption()
      • splitOpenFileCost

        public long splitOpenFileCost()
      • streamFromTimestamp

        public long streamFromTimestamp()
      • startTimestamp

        public java.lang.Long startTimestamp()
      • endTimestamp

        public java.lang.Long endTimestamp()
      • maxFilesPerMicroBatch

        public int maxFilesPerMicroBatch()
      • maxRecordsPerMicroBatch

        public int maxRecordsPerMicroBatch()
      • preserveDataGrouping

        public boolean preserveDataGrouping()
      • aggregatePushDownEnabled

        public boolean aggregatePushDownEnabled()
      • adaptiveSplitSizeEnabled

        public boolean adaptiveSplitSizeEnabled()
      • parallelism

        public int parallelism()
      • distributedPlanningEnabled

        public boolean distributedPlanningEnabled()
      • deletePlanningMode

        public PlanningMode deletePlanningMode()
      • executorCacheLocalityEnabled

        public boolean executorCacheLocalityEnabled()
      • reportColumnStats

        public boolean reportColumnStats()