Class IcebergInputFormat<T>

  • Type Parameters:
    T - the in-memory data model, which can be Pig tuples or Hive rows; the default is Iceberg records.

    public class IcebergInputFormat<T>
    extends org.apache.hadoop.mapreduce.InputFormat<java.lang.Void, T>
    Generic MRv2 (MapReduce v2) InputFormat API for Iceberg.
    • Method Summary

      Modifier and Type Method Description
      static InputFormatConfig.ConfigBuilder configure(org.apache.hadoop.mapreduce.Job job)
      Configures the Job to use the IcebergInputFormat and returns a helper to add further configuration.
      org.apache.hadoop.mapreduce.RecordReader<java.lang.Void, T> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
      java.util.List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • IcebergInputFormat

        public IcebergInputFormat()
    • Method Detail

      • configure

        public static InputFormatConfig.ConfigBuilder configure(org.apache.hadoop.mapreduce.Job job)
        Configures the Job to use the IcebergInputFormat and returns a helper to add further configuration.
        Parameters:
        job - the Job to configure
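
        The description above implies a two-step contract: configure registers the input format on the job, then hands back a builder that writes further settings into that same job. The sketch below models only that contract using JDK classes. FakeJob, FakeConfigBuilder, the set method, and every property key are hypothetical stand-ins, not Hadoop or Iceberg API; the real call takes an org.apache.hadoop.mapreduce.Job and returns an InputFormatConfig.ConfigBuilder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConfigureSketch {

    /** Hypothetical stand-in for a Hadoop Job: it only holds string properties. */
    static final class FakeJob {
        final Map<String, String> conf = new LinkedHashMap<>();
    }

    /** Hypothetical stand-in for the helper returned by configure. */
    static final class FakeConfigBuilder {
        private final Map<String, String> conf;

        FakeConfigBuilder(Map<String, String> conf) {
            this.conf = conf;
        }

        /** Stores one setting in the job's configuration and returns this builder for chaining. */
        FakeConfigBuilder set(String key, String value) {
            conf.put(key, value);
            return this;
        }
    }

    /** Models configure(job): register the input format, then return a builder over the same configuration. */
    static FakeConfigBuilder configure(FakeJob job) {
        // Illustrative key and value only; the real property name is not given in this page.
        job.conf.put("input.format.class", "IcebergInputFormat");
        return new FakeConfigBuilder(job.conf);
    }

    public static void main(String[] args) {
        FakeJob job = new FakeJob();

        // Chained calls write into the job's own configuration.
        configure(job)
            .set("table.location", "/warehouse/db/events")
            .set("case.sensitive", "false");

        job.conf.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

        Because the builder in this model shares the job's configuration rather than copying it, nothing needs to be passed back to the job after the chained calls finish.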
      • getSplits

        public java.util.List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
        Specified by:
        getSplits in class org.apache.hadoop.mapreduce.InputFormat<java.lang.Void, T>
      • createRecordReader

        public org.apache.hadoop.mapreduce.RecordReader<java.lang.Void, T> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
                                                                                             org.apache.hadoop.mapreduce.TaskAttemptContext context)
        Specified by:
        createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<java.lang.Void, T>