Package org.apache.iceberg.mr.hive
Class HiveIcebergStorageHandler
java.lang.Object
    org.apache.iceberg.mr.hive.HiveIcebergStorageHandler
- All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.hive.ql.metadata.HiveStorageHandler, org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler
public class HiveIcebergStorageHandler extends java.lang.Object implements org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler, org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
-
Constructor Summary
HiveIcebergStorageHandler()
-
Method Summary
static java.lang.String catalogName(org.apache.hadoop.conf.Configuration config, java.lang.String name)
    Returns the catalog name serialized to the configuration.
static void checkAndSetIoConfig(org.apache.hadoop.conf.Configuration config, Table table)
    If enabled, it populates the FileIO's hadoop configuration with the input config object.
static void checkAndSkipIoConfigSerialization(org.apache.hadoop.conf.Configuration config, Table table)
    If enabled, it ensures that the FileIO's hadoop configuration will not be serialized.
void configureInputJobCredentials(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> secrets)
void configureInputJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
void configureJobConf(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, org.apache.hadoop.mapred.JobConf jobConf)
void configureOutputJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
void configureTableJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler.DecomposedPredicate decomposePredicate(org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.hive.serde2.Deserializer deserializer, org.apache.hadoop.hive.ql.plan.ExprNodeDesc exprNodeDesc)
org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider getAuthorizationProvider()
org.apache.hadoop.conf.Configuration getConf()
java.lang.Class<? extends org.apache.hadoop.mapred.InputFormat> getInputFormatClass()
org.apache.hadoop.hive.metastore.HiveMetaHook getMetaHook()
java.lang.Class<? extends org.apache.hadoop.mapred.OutputFormat> getOutputFormatClass()
java.lang.Class<? extends org.apache.hadoop.hive.serde2.AbstractSerDe> getSerDeClass()
static java.util.Collection<java.lang.String> outputTables(org.apache.hadoop.conf.Configuration config)
    Returns the names of the output tables stored in the configuration.
static Schema schema(org.apache.hadoop.conf.Configuration config)
    Returns the Table Schema serialized to the configuration.
void setConf(org.apache.hadoop.conf.Configuration conf)
static Table table(org.apache.hadoop.conf.Configuration config, java.lang.String name)
    Returns the Table serialized to the configuration based on the table name.
java.lang.String toString()
-
-
-
Method Detail
-
getInputFormatClass
public java.lang.Class<? extends org.apache.hadoop.mapred.InputFormat> getInputFormatClass()
- Specified by:
getInputFormatClass in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
getOutputFormatClass
public java.lang.Class<? extends org.apache.hadoop.mapred.OutputFormat> getOutputFormatClass()
- Specified by:
getOutputFormatClass in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
getSerDeClass
public java.lang.Class<? extends org.apache.hadoop.hive.serde2.AbstractSerDe> getSerDeClass()
- Specified by:
getSerDeClass in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
getMetaHook
public org.apache.hadoop.hive.metastore.HiveMetaHook getMetaHook()
- Specified by:
getMetaHook in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
getAuthorizationProvider
public org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider getAuthorizationProvider()
- Specified by:
getAuthorizationProvider in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
configureInputJobProperties
public void configureInputJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
- Specified by:
configureInputJobProperties in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
configureOutputJobProperties
public void configureOutputJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
- Specified by:
configureOutputJobProperties in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
configureTableJobProperties
public void configureTableJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
- Specified by:
configureTableJobProperties in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
configureInputJobCredentials
public void configureInputJobCredentials(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> secrets)
-
configureJobConf
public void configureJobConf(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, org.apache.hadoop.mapred.JobConf jobConf)
- Specified by:
configureJobConf in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
getConf
public org.apache.hadoop.conf.Configuration getConf()
- Specified by:
getConf in interface org.apache.hadoop.conf.Configurable
-
setConf
public void setConf(org.apache.hadoop.conf.Configuration conf)
- Specified by:
setConf in interface org.apache.hadoop.conf.Configurable
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
decomposePredicate
public org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler.DecomposedPredicate decomposePredicate(org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.hive.serde2.Deserializer deserializer, org.apache.hadoop.hive.ql.plan.ExprNodeDesc exprNodeDesc)
- Specified by:
decomposePredicate in interface org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler
- Parameters:
jobConf - Job configuration for InputFormat to access
deserializer - Deserializer
exprNodeDesc - Filter expression extracted by Hive
- Returns:
Entire filter to take advantage of Hive's pruning as well as Iceberg's pruning.
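One reading of the return description is that the whole filter is handed to the storage layer and also kept for Hive to evaluate, so both sides can prune. The sketch below models that idea with plain Java only. The Decomposed class is a hypothetical stand-in for Hive's DecomposedPredicate, the filter is a plain string rather than an ExprNodeDesc, and nothing here is taken from the handler's actual source.

```java
public class DecomposeSketch {

    /** Hypothetical stand-in for Hive's DecomposedPredicate; not the real class. */
    static final class Decomposed {
        final String pushedPredicate;    // part offered to the storage layer
        final String residualPredicate;  // part Hive still evaluates itself

        Decomposed(String pushedPredicate, String residualPredicate) {
            this.pushedPredicate = pushedPredicate;
            this.residualPredicate = residualPredicate;
        }
    }

    /**
     * Assumed behaviour: the entire filter is used on both sides, so the
     * storage layer can skip data early and Hive can still re-check each row.
     */
    static Decomposed decompose(String filter) {
        return new Decomposed(filter, filter);
    }

    public static void main(String[] args) {
        Decomposed d = decompose("id > 10 AND region = 'EU'");
        System.out.println("pushed:   " + d.pushedPredicate);
        System.out.println("residual: " + d.residualPredicate);
    }
}
```

Keeping the full filter as the residual trades some repeated evaluation for safety: rows that the storage layer could not exclude are still filtered by Hive.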
-
table
public static Table table(org.apache.hadoop.conf.Configuration config, java.lang.String name)
Returns the Table serialized to the configuration based on the table name. If the table's FileIO has no configuration, it is populated from the input config.
- Parameters:
config - The configuration to read the data from
name - The name of the table we need, as returned by TableDesc.getTableName()
- Returns:
The Table
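To illustrate what "serialized to the configuration based on the table name" can look like, here is a minimal, library-free sketch. It assumes the object is Java-serialized, Base64-encoded, and stored under a per-table key. The key prefix, the Map standing in for a Hadoop Configuration, and the list standing in for a Table are all hypothetical and not taken from the Iceberg source.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

public class SerializedTableLookupSketch {

    // Hypothetical key prefix; the real property name is not given in this page.
    static final String PREFIX = "example.serialized.table.";

    /** Serializes the value and stores it as Base64 text under a per-table key. */
    static void put(Map<String, String> config, String tableName, Serializable table) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(table);
            }
            config.put(PREFIX + tableName, Base64.getEncoder().encodeToString(bytes.toByteArray()));
        } catch (IOException e) {
            throw new IllegalStateException("Could not serialize " + tableName, e);
        }
    }

    /** Looks up the entry for the given table name and deserializes it. */
    static Object table(Map<String, String> config, String tableName) {
        String encoded = config.get(PREFIX + tableName);
        if (encoded == null) {
            throw new IllegalArgumentException("No serialized table for " + tableName);
        }
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not deserialize " + tableName, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        // A list of column names stands in for a real table object.
        ArrayList<String> fakeTable = new ArrayList<>(Arrays.asList("id", "amount"));
        put(config, "db.orders", fakeTable);
        System.out.println(table(config, "db.orders")); // prints [id, amount]
    }
}
```

Keying each entry by table name lets one configuration carry several tables at once, which is why the lookup takes the name returned by TableDesc.getTableName().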
-
checkAndSetIoConfig
public static void checkAndSetIoConfig(org.apache.hadoop.conf.Configuration config, Table table)
If enabled, it populates the FileIO's Hadoop configuration with the input config object. This might be necessary when the table object was serialized without the FileIO config.
- Parameters:
config - Configuration to set for FileIO, if enabled
table - The Iceberg table object
-
checkAndSkipIoConfigSerialization
public static void checkAndSkipIoConfigSerialization(org.apache.hadoop.conf.Configuration config, Table table)
If enabled, it ensures that the FileIO's Hadoop configuration will not be serialized. This might be desirable for decreasing the overall size of serialized table objects.
Note: Skipping FileIO config serialization in this fashion might in turn necessitate calling checkAndSetIoConfig(Configuration, Table) on the deserializer side to enable subsequent use of the FileIO.
- Parameters:
config - Configuration to set for FileIO in a transient manner, if enabled
table - The Iceberg table object
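The interplay between skipping config serialization and restoring the config on the receiving side can be sketched with plain Java serialization. This is an analogy under stated assumptions, not the handler's implementation: FakeTable, roundTrip and setIoConfigIfMissing are hypothetical names, and a transient Map stands in for the FileIO's Hadoop configuration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class TransientConfigSketch {

    /** Hypothetical stand-in for a table whose IO layer carries a configuration. */
    static final class FakeTable implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        // transient: left out of the serialized form, so it is null after a round trip
        transient Map<String, String> ioConfig;

        FakeTable(String name, Map<String, String> ioConfig) {
            this.name = name;
            this.ioConfig = ioConfig;
        }
    }

    /** Serializes and deserializes the table, as would happen between planner and task. */
    static FakeTable roundTrip(FakeTable table) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(table);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FakeTable) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Deserializer-side step: fill in the config only when it is missing. */
    static void setIoConfigIfMissing(Map<String, String> config, FakeTable table) {
        if (table.ioConfig == null) {
            table.ioConfig = new HashMap<>(config);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("fs.defaultFS", "hdfs://namenode:8020"); // placeholder value
        FakeTable received = roundTrip(new FakeTable("db.orders", config));
        System.out.println(received.ioConfig);              // prints null
        setIoConfigIfMissing(config, received);
        System.out.println(received.ioConfig.get("fs.defaultFS"));
    }
}
```

The serialized table is smaller because the configuration travels once with the job instead of once per table, at the cost of the extra restore call before the IO layer is used.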
-
outputTables
public static java.util.Collection<java.lang.String> outputTables(org.apache.hadoop.conf.Configuration config)
Returns the names of the output tables stored in the configuration.
- Parameters:
config - The configuration to read the data from
- Returns:
The collection of table names, as returned by TableDesc.getTableName()
-
catalogName
public static java.lang.String catalogName(org.apache.hadoop.conf.Configuration config, java.lang.String name)
Returns the catalog name serialized to the configuration.
- Parameters:
config - The configuration to read the data from
name - The name of the table we need, as returned by TableDesc.getTableName()
- Returns:
The catalog name
-
schema
public static Schema schema(org.apache.hadoop.conf.Configuration config)
Returns the Table Schema serialized to the configuration.
- Parameters:
config - The configuration to read the data from
- Returns:
The Table Schema object
-
-