Class HiveIcebergStorageHandler

  • All Implemented Interfaces:
    org.apache.hadoop.conf.Configurable, org.apache.hadoop.hive.ql.metadata.HiveStorageHandler, org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler

    public class HiveIcebergStorageHandler
    extends java.lang.Object
    implements org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler, org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
    • Nested Class Summary

      • Nested classes/interfaces inherited from interface org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler

        org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler.DecomposedPredicate
    • Method Summary

      Modifier and Type Method Description
      static java.lang.String catalogName​(org.apache.hadoop.conf.Configuration config, java.lang.String name)
      Returns the catalog name serialized to the configuration.
      static void checkAndSetIoConfig​(org.apache.hadoop.conf.Configuration config, Table table)
      If enabled, populates the FileIO's Hadoop configuration with the input config object.
      static void checkAndSkipIoConfigSerialization​(org.apache.hadoop.conf.Configuration config, Table table)
      If enabled, ensures that the FileIO's Hadoop configuration will not be serialized.
      void configureInputJobCredentials​(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,​java.lang.String> secrets)  
      void configureInputJobProperties​(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,​java.lang.String> map)  
      void configureJobConf​(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, org.apache.hadoop.mapred.JobConf jobConf)  
      void configureOutputJobProperties​(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,​java.lang.String> map)  
      void configureTableJobProperties​(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,​java.lang.String> map)  
      org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler.DecomposedPredicate decomposePredicate​(org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.hive.serde2.Deserializer deserializer, org.apache.hadoop.hive.ql.plan.ExprNodeDesc exprNodeDesc)  
      org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider getAuthorizationProvider()  
      org.apache.hadoop.conf.Configuration getConf()  
      java.lang.Class<? extends org.apache.hadoop.mapred.InputFormat> getInputFormatClass()  
      org.apache.hadoop.hive.metastore.HiveMetaHook getMetaHook()  
      java.lang.Class<? extends org.apache.hadoop.mapred.OutputFormat> getOutputFormatClass()  
      java.lang.Class<? extends org.apache.hadoop.hive.serde2.AbstractSerDe> getSerDeClass()  
      static java.util.Collection<java.lang.String> outputTables​(org.apache.hadoop.conf.Configuration config)
      Returns the names of the output tables stored in the configuration.
      static Schema schema​(org.apache.hadoop.conf.Configuration config)
      Returns the Table Schema serialized to the configuration.
      void setConf​(org.apache.hadoop.conf.Configuration conf)  
      static Table table​(org.apache.hadoop.conf.Configuration config, java.lang.String name)
      Returns the Table serialized to the configuration based on the table name.
      java.lang.String toString()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Constructor Detail

      • HiveIcebergStorageHandler

        public HiveIcebergStorageHandler()
    • Method Detail

      • getInputFormatClass

        public java.lang.Class<? extends org.apache.hadoop.mapred.InputFormat> getInputFormatClass()
        Specified by:
        getInputFormatClass in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
      • getOutputFormatClass

        public java.lang.Class<? extends org.apache.hadoop.mapred.OutputFormat> getOutputFormatClass()
        Specified by:
        getOutputFormatClass in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
      • getSerDeClass

        public java.lang.Class<? extends org.apache.hadoop.hive.serde2.AbstractSerDe> getSerDeClass()
        Specified by:
        getSerDeClass in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
      • getMetaHook

        public org.apache.hadoop.hive.metastore.HiveMetaHook getMetaHook()
        Specified by:
        getMetaHook in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
      • getAuthorizationProvider

        public org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider getAuthorizationProvider()
        Specified by:
        getAuthorizationProvider in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
      • configureInputJobProperties

        public void configureInputJobProperties​(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc,
                                                java.util.Map<java.lang.String,​java.lang.String> map)
        Specified by:
        configureInputJobProperties in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
      • configureOutputJobProperties

        public void configureOutputJobProperties​(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc,
                                                 java.util.Map<java.lang.String,​java.lang.String> map)
        Specified by:
        configureOutputJobProperties in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
      • configureTableJobProperties

        public void configureTableJobProperties​(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc,
                                                java.util.Map<java.lang.String,​java.lang.String> map)
        Specified by:
        configureTableJobProperties in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
      • configureInputJobCredentials

        public void configureInputJobCredentials​(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc,
                                                 java.util.Map<java.lang.String,​java.lang.String> secrets)
      • configureJobConf

        public void configureJobConf​(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc,
                                     org.apache.hadoop.mapred.JobConf jobConf)
        Specified by:
        configureJobConf in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
      • getConf

        public org.apache.hadoop.conf.Configuration getConf()
        Specified by:
        getConf in interface org.apache.hadoop.conf.Configurable
      • setConf

        public void setConf​(org.apache.hadoop.conf.Configuration conf)
        Specified by:
        setConf in interface org.apache.hadoop.conf.Configurable
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • decomposePredicate

        public org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler.DecomposedPredicate decomposePredicate​(org.apache.hadoop.mapred.JobConf jobConf,
                                                                                                                     org.apache.hadoop.hive.serde2.Deserializer deserializer,
                                                                                                                     org.apache.hadoop.hive.ql.plan.ExprNodeDesc exprNodeDesc)
        Specified by:
        decomposePredicate in interface org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler
        Parameters:
        jobConf - Job configuration that the InputFormat can access
        deserializer - Deserializer used by the table's input format
        exprNodeDesc - Filter expression extracted by Hive
        Returns:
        The entire filter, so that both Hive's pruning and Iceberg's pruning can be applied.
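        One way to read this return value is that the whole filter is handed to the storage layer and is also kept for the engine to re-check. The sketch below illustrates that idea with plain Java types only. Decomposed, decompose and scan are hypothetical stand-ins, not Hive or Iceberg classes, and the "files" are just lists of integers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class DecomposeSketch {
    /** Hypothetical stand-in for DecomposedPredicate. */
    public static final class Decomposed<T> {
        public final Predicate<T> pushed;    // evaluated by the storage layer
        public final Predicate<T> residual;  // re-evaluated by the query engine

        Decomposed(Predicate<T> pushed, Predicate<T> residual) {
            this.pushed = pushed;
            this.residual = residual;
        }
    }

    /** Assumption: the entire filter is used as both the pushed and the residual part. */
    public static <T> Decomposed<T> decompose(Predicate<T> filter) {
        return new Decomposed<>(filter, filter);
    }

    /**
     * Storage-side pruning is coarse (it keeps or drops whole files),
     * so the residual is still applied to every surviving row.
     */
    public static List<Integer> scan(List<List<Integer>> files, Decomposed<Integer> d) {
        return files.stream()
            .filter(file -> file.stream().anyMatch(d.pushed)) // file-level pruning
            .flatMap(List::stream)
            .filter(d.residual)                               // row-level filtering
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Integer>> files = Arrays.asList(
            Arrays.asList(1, 2, 3), Arrays.asList(10, 20), Arrays.asList(5, 30));
        Decomposed<Integer> d = decompose(v -> v >= 10);
        System.out.println(scan(files, d)); // prints [10, 20, 30]
    }
}
```

        In this sketch the first file is skipped entirely, while the value 5 in the last file is removed only by the residual check.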
      • table

        public static Table table​(org.apache.hadoop.conf.Configuration config,
                                  java.lang.String name)
        Returns the Table serialized to the configuration based on the table name. If configuration is missing from the FileIO of the table, it will be populated with the input config.
        Parameters:
        config - The configuration to read the serialized table from
        name - The name of the table, as returned by TableDesc.getTableName()
        Returns:
        The Table
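        The lookup described here (a serialized object stored in a string-valued configuration and retrieved by table name) can be sketched roughly as follows. This is an illustration only: a Map stands in for the Hadoop Configuration, a list of column names stands in for the Table, and the key prefix and the Base64 encoding are assumptions, since this page does not state how the value is stored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

public class SerializedTableLookupSketch {
    // Hypothetical key prefix; the real property name is not given on this page.
    static final String PREFIX = "example.serialized.table.";

    /** Stores a serializable object under PREFIX + name as a Base64 string. */
    public static void put(Map<String, String> config, String name, Serializable table) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(table);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        config.put(PREFIX + name, Base64.getEncoder().encodeToString(bytes.toByteArray()));
    }

    /** Reads the object stored for the given table name back from the configuration. */
    public static Object get(Map<String, String> config, String name) {
        String encoded = config.get(PREFIX + name);
        if (encoded == null) {
            throw new IllegalArgumentException("No serialized table for " + name);
        }
        byte[] data = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Cannot deserialize table " + name, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        ArrayList<String> columns = new ArrayList<>(Arrays.asList("id", "name"));
        put(config, "default.orders", columns);
        System.out.println(get(config, "default.orders")); // prints [id, name]
    }
}
```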
      • checkAndSetIoConfig

        public static void checkAndSetIoConfig​(org.apache.hadoop.conf.Configuration config,
                                               Table table)
        If enabled, populates the FileIO's Hadoop configuration with the input config object. This might be necessary when the table object was serialized without the FileIO config.
        Parameters:
        config - Configuration to set for FileIO, if enabled
        table - The Iceberg table object
      • checkAndSkipIoConfigSerialization

        public static void checkAndSkipIoConfigSerialization​(org.apache.hadoop.conf.Configuration config,
                                                             Table table)
        If enabled, ensures that the FileIO's Hadoop configuration will not be serialized. This might be desirable for decreasing the overall size of serialized table objects.

        Note: Skipping FileIO config serialization in this way might in turn require calling checkAndSetIoConfig(Configuration, Table) on the deserializing side before the FileIO can be used again.

        Parameters:
        config - Configuration to set for FileIO in a transient manner, if enabled
        table - The Iceberg table object
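        The interplay between skipping the configuration during serialization and restoring it afterwards can be illustrated with standard Java serialization. This is a minimal sketch under stated assumptions: FileIoStandIn is a hypothetical class, a Map stands in for the Hadoop Configuration, and a transient field is only one possible way to keep data out of the serialized form. It does not show how the Iceberg FileIO is actually implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class TransientIoConfigSketch {
    /** Hypothetical stand-in for a FileIO that carries a configuration. */
    public static class FileIoStandIn implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String location;
        // transient: excluded from the serialized form, which keeps the payload small.
        private transient Map<String, String> conf;

        public FileIoStandIn(String location, Map<String, String> conf) {
            this.location = location;
            this.conf = conf;
        }

        public Map<String, String> conf() {
            return conf;
        }

        /** Counterpart of the "set" step: only fills in a missing configuration. */
        public void setConfIfMissing(Map<String, String> newConf) {
            if (conf == null) {
                conf = newConf;
            }
        }
    }

    /** Serializes and deserializes the object, as would happen between planner and task. */
    public static FileIoStandIn roundTrip(FileIoStandIn io) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(io);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FileIoStandIn) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.defaultFS", "hdfs://example:8020");
        FileIoStandIn copy = roundTrip(new FileIoStandIn("/warehouse/t", conf));
        System.out.println(copy.conf() == null);             // prints true
        copy.setConfIfMissing(conf);                         // restore on the receiving side
        System.out.println(copy.conf().get("fs.defaultFS")); // prints hdfs://example:8020
    }
}
```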
      • outputTables

        public static java.util.Collection<java.lang.String> outputTables​(org.apache.hadoop.conf.Configuration config)
        Returns the names of the output tables stored in the configuration.
        Parameters:
        config - The configuration to read the output table names from
        Returns:
        The collection of the table names as returned by TableDesc.getTableName()
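        Storing several table names in a single configuration value requires some encoding. The sketch below shows one simple possibility, a separator-joined string. The key name and the ".." separator are assumptions made for illustration (this page does not specify them), and a Map stands in for the Hadoop Configuration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class OutputTablesSketch {
    // Hypothetical key and separator. A two-dot separator is chosen here because
    // qualified names such as "db.table" already contain single dots.
    static final String KEY = "example.output.tables";
    static final String SEPARATOR = "..";

    /** Writes the table names into the configuration as one joined string. */
    public static void write(Map<String, String> config, Collection<String> names) {
        config.put(KEY, String.join(SEPARATOR, names));
    }

    /** Reads the table names back; a missing or empty value yields an empty list. */
    public static List<String> read(Map<String, String> config) {
        String value = config.get(KEY);
        if (value == null || value.isEmpty()) {
            return Collections.emptyList();
        }
        // Pattern.quote is needed because String.split interprets its argument as a regex.
        return Arrays.asList(value.split(Pattern.quote(SEPARATOR)));
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        write(config, Arrays.asList("default.orders", "default.customers"));
        System.out.println(read(config)); // prints [default.orders, default.customers]
    }
}
```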
      • catalogName

        public static java.lang.String catalogName​(org.apache.hadoop.conf.Configuration config,
                                                   java.lang.String name)
        Returns the catalog name serialized to the configuration.
        Parameters:
        config - The configuration to read the catalog name from
        name - The name of the table, as returned by TableDesc.getTableName()
        Returns:
        The catalog name
      • schema

        public static Schema schema​(org.apache.hadoop.conf.Configuration config)
        Returns the Table Schema serialized to the configuration.
        Parameters:
        config - The configuration to read the serialized schema from
        Returns:
        The Table Schema object