Package org.apache.iceberg.mr.hive
Class HiveIcebergStorageHandler
java.lang.Object
    org.apache.iceberg.mr.hive.HiveIcebergStorageHandler
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.hive.ql.metadata.HiveStorageHandler, org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler
public class HiveIcebergStorageHandler extends java.lang.Object implements org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler, org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
-
Constructor Summary
HiveIcebergStorageHandler()
-
Method Summary
Modifier and Type Method Description
static java.lang.String
catalogName(org.apache.hadoop.conf.Configuration config, java.lang.String name)
Returns the catalog name serialized to the configuration.
static void
checkAndSetIoConfig(org.apache.hadoop.conf.Configuration config, Table table)
If enabled, it populates the FileIO's hadoop configuration with the input config object.
static void
checkAndSkipIoConfigSerialization(org.apache.hadoop.conf.Configuration config, Table table)
If enabled, it ensures that the FileIO's hadoop configuration will not be serialized.
void
configureInputJobCredentials(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> secrets)
void
configureInputJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
void
configureJobConf(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, org.apache.hadoop.mapred.JobConf jobConf)
void
configureOutputJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
void
configureTableJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler.DecomposedPredicate
decomposePredicate(org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.hive.serde2.Deserializer deserializer, org.apache.hadoop.hive.ql.plan.ExprNodeDesc exprNodeDesc)
org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider
getAuthorizationProvider()
org.apache.hadoop.conf.Configuration
getConf()
java.lang.Class<? extends org.apache.hadoop.mapred.InputFormat>
getInputFormatClass()
org.apache.hadoop.hive.metastore.HiveMetaHook
getMetaHook()
java.lang.Class<? extends org.apache.hadoop.mapred.OutputFormat>
getOutputFormatClass()
java.lang.Class<? extends org.apache.hadoop.hive.serde2.AbstractSerDe>
getSerDeClass()
static java.util.Collection<java.lang.String>
outputTables(org.apache.hadoop.conf.Configuration config)
Returns the names of the output tables stored in the configuration.
static Schema
schema(org.apache.hadoop.conf.Configuration config)
Returns the Table Schema serialized to the configuration.
void
setConf(org.apache.hadoop.conf.Configuration conf)
static Table
table(org.apache.hadoop.conf.Configuration config, java.lang.String name)
Returns the Table serialized to the configuration based on the table name.
java.lang.String
toString()
-
-
-
Method Detail
-
getInputFormatClass
public java.lang.Class<? extends org.apache.hadoop.mapred.InputFormat> getInputFormatClass()
- Specified by:
getInputFormatClass in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
getOutputFormatClass
public java.lang.Class<? extends org.apache.hadoop.mapred.OutputFormat> getOutputFormatClass()
- Specified by:
getOutputFormatClass in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
getSerDeClass
public java.lang.Class<? extends org.apache.hadoop.hive.serde2.AbstractSerDe> getSerDeClass()
- Specified by:
getSerDeClass in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
getMetaHook
public org.apache.hadoop.hive.metastore.HiveMetaHook getMetaHook()
- Specified by:
getMetaHook in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
getAuthorizationProvider
public org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider getAuthorizationProvider()
- Specified by:
getAuthorizationProvider in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
configureInputJobProperties
public void configureInputJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
- Specified by:
configureInputJobProperties in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
configureOutputJobProperties
public void configureOutputJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
- Specified by:
configureOutputJobProperties in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
configureTableJobProperties
public void configureTableJobProperties(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> map)
- Specified by:
configureTableJobProperties in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
configureInputJobCredentials
public void configureInputJobCredentials(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, java.util.Map<java.lang.String,java.lang.String> secrets)
-
configureJobConf
public void configureJobConf(org.apache.hadoop.hive.ql.plan.TableDesc tableDesc, org.apache.hadoop.mapred.JobConf jobConf)
- Specified by:
configureJobConf in interface org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
-
getConf
public org.apache.hadoop.conf.Configuration getConf()
- Specified by:
getConf in interface org.apache.hadoop.conf.Configurable
-
setConf
public void setConf(org.apache.hadoop.conf.Configuration conf)
- Specified by:
setConf in interface org.apache.hadoop.conf.Configurable
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
decomposePredicate
public org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler.DecomposedPredicate decomposePredicate(org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.hive.serde2.Deserializer deserializer, org.apache.hadoop.hive.ql.plan.ExprNodeDesc exprNodeDesc)
- Specified by:
decomposePredicate in interface org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler
- Parameters:
jobConf - Job configuration for InputFormat to access
deserializer - Deserializer
exprNodeDesc - Filter expression extracted by Hive
- Returns:
Entire filter to take advantage of Hive's pruning as well as Iceberg's pruning.
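The Returns note can be read as: the whole filter is kept both as the predicate pushed to storage and as the predicate the engine re-applies. The following is a minimal sketch of that idea using only the JDK. `DecomposeSketch`, `Decomposed`, `decompose` and `scan` are hypothetical names for illustration; they are not Hive or Iceberg classes, and the sketch assumes (it is not confirmed by this page) that both sides receive the same expression.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration only; not the Hive DecomposedPredicate class.
final class DecomposeSketch {

    /** Pairs the filter handed to storage with the filter the engine re-applies. */
    static final class Decomposed<T> {
        final Predicate<T> pushed;
        final Predicate<T> residual;

        Decomposed(Predicate<T> pushed, Predicate<T> residual) {
            this.pushed = pushed;
            this.residual = residual;
        }
    }

    /** Assumption: the entire filter is used on both sides, so either layer may prune. */
    static <T> Decomposed<T> decompose(Predicate<T> filter) {
        return new Decomposed<>(filter, filter);
    }

    /** Applies the pushed filter first (storage side), then the residual (engine side). */
    static <T> List<T> scan(List<T> rows, Decomposed<T> decomposed) {
        return rows.stream()
                .filter(decomposed.pushed)
                .filter(decomposed.residual)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Decomposed<Integer> decomposed = decompose(x -> x > 10);
        System.out.println(scan(List.of(5, 11, 42), decomposed)); // prints [11, 42]
    }
}
```

Applying the same filter twice does not change the result; it only means storage-level pruning can be coarse (for example, whole files) while the engine still guarantees row-level correctness.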
-
table
public static Table table(org.apache.hadoop.conf.Configuration config, java.lang.String name)
Returns the Table serialized to the configuration based on the table name. If the table's FileIO has no configuration, it is populated with the input config.
- Parameters:
config - The configuration to read the data from
name - The name of the table we need, as returned by TableDesc.getTableName()
- Returns:
The Table
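To illustrate what "serialized to the configuration based on the table name" could look like, here is a minimal sketch that stores a serializable object as a Base64 string under a key derived from the table name and reads it back. Everything in it is an assumption for illustration: a `Map<String, String>` stands in for the Hadoop `Configuration`, `TableStub` stands in for the Iceberg `Table`, and the `PREFIX` key is hypothetical, since this page does not give the real property name or encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the Iceberg implementation.
final class SerializedTableLookupSketch {
    // Hypothetical key prefix; the real property name is not stated on this page.
    static final String PREFIX = "sketch.serialized.table.";

    /** Hypothetical stand-in for an Iceberg Table. */
    static final class TableStub implements Serializable {
        private static final long serialVersionUID = 1L;
        final String location;

        TableStub(String location) {
            this.location = location;
        }
    }

    /** Serializes the table and stores it as Base64 text under PREFIX + name. */
    static void store(Map<String, String> config, String name, TableStub table) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(table);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        config.put(PREFIX + name, Base64.getEncoder().encodeToString(bytes.toByteArray()));
    }

    /** Looks up the entry for the given table name and deserializes it. */
    static TableStub table(Map<String, String> config, String name) {
        String encoded = config.get(PREFIX + name);
        if (encoded == null) {
            throw new IllegalArgumentException("No table serialized for " + name);
        }
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (TableStub) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        store(config, "db.events", new TableStub("file:///warehouse/db/events"));
        System.out.println(table(config, "db.events").location);
    }
}
```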
-
checkAndSetIoConfig
public static void checkAndSetIoConfig(org.apache.hadoop.conf.Configuration config, Table table)
If enabled, it populates the FileIO's hadoop configuration with the input config object. This might be necessary when the table object was serialized without the FileIO config.
- Parameters:
config - Configuration to set for FileIO, if enabled
table - The Iceberg table object
-
checkAndSkipIoConfigSerialization
public static void checkAndSkipIoConfigSerialization(org.apache.hadoop.conf.Configuration config, Table table)
If enabled, it ensures that the FileIO's hadoop configuration will not be serialized. This might be desirable for decreasing the overall size of serialized table objects.
Note: Skipping FileIO config serialization in this fashion might in turn necessitate calling checkAndSetIoConfig(Configuration, Table) on the deserializer side to enable subsequent use of the FileIO.
- Parameters:
config - Configuration to set for FileIO in a transient manner, if enabled
table - The Iceberg table object
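The interplay between checkAndSkipIoConfigSerialization and checkAndSetIoConfig can be sketched with plain Java serialization: a field marked `transient` is left out of the serialized form, so the receiving side has to put the configuration back before using the object. This is an analogy under stated assumptions, not the Iceberg code path: `IoStub` is a hypothetical stand-in for a FileIO, a `Map<String, String>` stands in for the Hadoop `Configuration`, and the enabling flag mentioned above is omitted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the Iceberg implementation.
final class TransientIoConfigSketch {

    /** Hypothetical stand-in for a FileIO that carries a configuration. */
    static final class IoStub implements Serializable {
        private static final long serialVersionUID = 1L;
        final String impl;
        // transient: skipped by Java serialization, so it is null after a round trip
        transient Map<String, String> conf;

        IoStub(String impl, Map<String, String> conf) {
            this.impl = impl;
            this.conf = conf;
        }
    }

    /** Serializes and deserializes the stub, as would happen between planner and task. */
    static IoStub roundTrip(IoStub io) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(io);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (IoStub) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Deserializer-side step: fill in the configuration only if it is missing. */
    static void setConfIfMissing(Map<String, String> config, IoStub io) {
        if (io.conf == null) {
            io.conf = new HashMap<>(config);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("fs.defaultFS", "file:///");
        IoStub copy = roundTrip(new IoStub("hadoop", config));
        System.out.println(copy.conf);               // prints null
        setConfIfMissing(config, copy);
        System.out.println(copy.conf.get("fs.defaultFS")); // prints file:///
    }
}
```

The trade-off is the one the Note describes: the serialized object is smaller, but it is unusable for I/O until the configuration is restored on the receiving side.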
-
outputTables
public static java.util.Collection<java.lang.String> outputTables(org.apache.hadoop.conf.Configuration config)
Returns the names of the output tables stored in the configuration.
- Parameters:
config - The configuration to read the data from
- Returns:
The collection of the table names as returned by TableDesc.getTableName()
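As a rough picture of how a collection of table names might live in a string-valued configuration, the sketch below joins the names into one property and splits them on read. The property key and the separator are hypothetical; this page does not document how the names are actually encoded, and a `Map<String, String>` stands in for the Hadoop `Configuration`.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the Iceberg implementation.
final class OutputTablesSketch {
    static final String KEY = "sketch.output.tables"; // hypothetical property name
    static final String SEPARATOR = ",";              // hypothetical separator

    /** Stores the table names as a single delimited property value. */
    static void setOutputTables(Map<String, String> config, Collection<String> names) {
        config.put(KEY, String.join(SEPARATOR, names));
    }

    /** Reads the names back; an absent or empty property yields an empty collection. */
    static Collection<String> outputTables(Map<String, String> config) {
        String value = config.get(KEY);
        if (value == null || value.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(value.split(Pattern.quote(SEPARATOR)));
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        setOutputTables(config, List.of("db.t1", "db.t2"));
        System.out.println(outputTables(config)); // prints [db.t1, db.t2]
    }
}
```

A real encoding would need a separator that cannot occur inside a table name.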
-
catalogName
public static java.lang.String catalogName(org.apache.hadoop.conf.Configuration config, java.lang.String name)
Returns the catalog name serialized to the configuration.
- Parameters:
config - The configuration to read the data from
name - The name of the table we need, as returned by TableDesc.getTableName()
- Returns:
The catalog name
-
schema
public static Schema schema(org.apache.hadoop.conf.Configuration config)
Returns the Table Schema serialized to the configuration.
- Parameters:
config - The configuration to read the data from
- Returns:
The Table Schema object
-
-