Class IcebergInputFormat<T>

java.lang.Object
org.apache.hadoop.mapreduce.InputFormat<Void,T>
org.apache.iceberg.mr.mapreduce.IcebergInputFormat<T>
Type Parameters:
T - the in-memory data model of the records that are read: Pig tuples, Hive rows, or (by default) Iceberg records

public class IcebergInputFormat<T> extends org.apache.hadoop.mapreduce.InputFormat<Void,T>
Generic MRv2 (MapReduce v2) InputFormat API for Iceberg.
  • Constructor Details

    • IcebergInputFormat

      public IcebergInputFormat()
  • Method Details

    • configure

      public static InputFormatConfig.ConfigBuilder configure(org.apache.hadoop.mapreduce.Job job)
      Configures the Job to use the IcebergInputFormat and returns a helper to add further configuration.
      Parameters:
      job - the Job to configure
    • getSplits

      public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
      Specified by:
      getSplits in class org.apache.hadoop.mapreduce.InputFormat<Void,T>
    • createRecordReader

      public org.apache.hadoop.mapreduce.RecordReader<Void,T> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
      Specified by:
      createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<Void,T>
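
The two methods above follow the general InputFormat contract: getSplits plans the units of work, and createRecordReader is then called once per split to iterate over its records, with a Void (always null) key and a value of type T. The following is a minimal, self-contained sketch of that contract only. Split, Reader, ListInputFormat, and readAll are hypothetical stand-ins invented for illustration; they are not the Hadoop or Iceberg classes, and slicing an in-memory list by a fixed size is an assumption that says nothing about how IcebergInputFormat actually plans splits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins that only mirror the shape of
// InputFormat.getSplits(...) and InputFormat.createRecordReader(...).
public class InputFormatContractSketch {

    /** Stand-in for InputSplit: a contiguous slice [start, end) of a row list. */
    static final class Split {
        final int start;
        final int end; // exclusive

        Split(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Stand-in for RecordReader<Void, T>: the key is always null. */
    static final class Reader<T> {
        private final Iterator<T> rows;
        private T current;

        Reader(List<T> rows) {
            this.rows = rows.iterator();
        }

        /** Advances to the next record; returns false when the split is exhausted. */
        boolean nextKeyValue() {
            if (!rows.hasNext()) {
                return false;
            }
            current = rows.next();
            return true;
        }

        Void getCurrentKey() {
            return null;
        }

        T getCurrentValue() {
            return current;
        }
    }

    /** Stand-in for an InputFormat<Void, T> over an in-memory "table". */
    static final class ListInputFormat<T> {
        private final List<T> table;
        private final int splitSize;

        ListInputFormat(List<T> table, int splitSize) {
            if (splitSize <= 0) {
                throw new IllegalArgumentException("splitSize must be positive");
            }
            this.table = table;
            this.splitSize = splitSize;
        }

        /** Planning phase: cut the table into slices of at most splitSize rows. */
        List<Split> getSplits() {
            List<Split> splits = new ArrayList<>();
            for (int start = 0; start < table.size(); start += splitSize) {
                splits.add(new Split(start, Math.min(start + splitSize, table.size())));
            }
            return splits;
        }

        /** Reading phase: one reader per split, scoped to that split's rows. */
        Reader<T> createRecordReader(Split split) {
            return new Reader<>(table.subList(split.start, split.end));
        }
    }

    /** Plans the splits, then drains one reader per split in order. */
    static <T> List<T> readAll(ListInputFormat<T> format) {
        List<T> out = new ArrayList<>();
        for (Split split : format.getSplits()) {
            Reader<T> reader = format.createRecordReader(split);
            while (reader.nextKeyValue()) {
                out.add(reader.getCurrentValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ListInputFormat<String> format =
            new ListInputFormat<>(Arrays.asList("a", "b", "c", "d", "e"), 2);
        System.out.println(format.getSplits().size()); // prints 3
        System.out.println(readAll(format));           // prints [a, b, c, d, e]
    }
}
```

In a real job the framework, not user code, drives this loop, and each split is typically handed to a separate task; the sketch runs them sequentially only to keep the example short.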