IcebergInputFormat

java.lang.Object

org.apache.hadoop.mapreduce.InputFormat<Void,T>

org.apache.iceberg.mr.mapreduce.IcebergInputFormat<T>

Type Parameters: T - the in-memory data model, which can be Pig tuples or Hive rows. The default is Iceberg records.

public class IcebergInputFormat<T> extends org.apache.hadoop.mapreduce.InputFormat<Void,T>

Generic MRv2 (org.apache.hadoop.mapreduce) InputFormat API for Iceberg.

Constructor Summary

| Constructor | Description |
| --- | --- |
| IcebergInputFormat() | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static InputFormatConfig.ConfigBuilder | configure(org.apache.hadoop.mapreduce.Job job) | Configures the Job to use the IcebergInputFormat and returns a helper to add further configuration. |
| org.apache.hadoop.mapreduce.RecordReader<Void,T> | createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context) | |
| List<org.apache.hadoop.mapreduce.InputSplit> | getSplits(org.apache.hadoop.mapreduce.JobContext context) | |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- IcebergInputFormat
  
  public IcebergInputFormat()

Method Details
- configure
  
  public static InputFormatConfig.ConfigBuilder configure(org.apache.hadoop.mapreduce.Job job)
  
  Configures the Job to use the IcebergInputFormat and returns a helper to add further configuration.
  
  Parameters:
  
  job - the Job to configure
- getSplits
  
  public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
  
  Specified by:
  
  getSplits in class org.apache.hadoop.mapreduce.InputFormat<Void,T>
- createRecordReader
  
  public org.apache.hadoop.mapreduce.RecordReader<Void,T> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
  
  Specified by:
  
  createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<Void,T>
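
The getSplits and createRecordReader signatures above follow the usual MRv2 pattern: the input is first planned into splits, then each split gets its own reader that yields (key, value) pairs, with the key always null here because the key type is Void. The sketch below is a dependency-free model of that pattern only. It does not use Hadoop or Iceberg, and every type in it (SplitReadModel, Split, Reader, ListInputFormat) is a hypothetical stand-in invented for illustration. The real IcebergInputFormat plans splits from table metadata rather than from an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Plain-Java model of the split-then-read contract. All types are hypothetical. */
public class SplitReadModel {

  /** Stand-in for an InputSplit: the slice [start, end) of the input. */
  static final class Split {
    final int start;
    final int end;

    Split(int start, int end) {
      this.start = start;
      this.end = end;
    }
  }

  /** Stand-in for RecordReader<Void, T>: the key is always null, the value is the record. */
  static final class Reader<T> {
    private final Iterator<T> records;
    private T current;

    Reader(List<T> slice) {
      this.records = slice.iterator();
    }

    /** Advances to the next record; returns false once the split is exhausted. */
    boolean nextKeyValue() {
      if (!records.hasNext()) {
        return false;
      }
      current = records.next();
      return true;
    }

    Void getCurrentKey() {
      return null;
    }

    T getCurrentValue() {
      return current;
    }
  }

  /** Stand-in for InputFormat<Void, T>, backed by an in-memory list instead of a table. */
  static final class ListInputFormat<T> {
    private final List<T> data;
    private final int splitSize;

    ListInputFormat(List<T> data, int splitSize) {
      if (splitSize <= 0) {
        throw new IllegalArgumentException("splitSize must be positive");
      }
      this.data = data;
      this.splitSize = splitSize;
    }

    /** Plans the input into consecutive slices of at most splitSize records. */
    List<Split> getSplits() {
      List<Split> splits = new ArrayList<>();
      for (int start = 0; start < data.size(); start += splitSize) {
        splits.add(new Split(start, Math.min(start + splitSize, data.size())));
      }
      return splits;
    }

    /** Creates a reader over the records covered by one split. */
    Reader<T> createRecordReader(Split split) {
      return new Reader<>(data.subList(split.start, split.end));
    }
  }

  /** Drives the contract sequentially: plan the splits, then read each one to the end. */
  static <T> List<T> readAll(ListInputFormat<T> format) {
    List<T> out = new ArrayList<>();
    for (Split split : format.getSplits()) {
      Reader<T> reader = format.createRecordReader(split);
      while (reader.nextKeyValue()) {
        out.add(reader.getCurrentValue());
      }
    }
    return out;
  }

  public static void main(String[] args) {
    ListInputFormat<String> format =
        new ListInputFormat<>(Arrays.asList("a", "b", "c", "d", "e"), 2);
    System.out.println(format.getSplits().size()); // prints 3
    System.out.println(readAll(format));           // prints [a, b, c, d, e]
  }
}
```

In this model readAll walks the splits one after another in a single thread. In an actual MapReduce job the framework typically hands each split to a separate task, which is why planning (getSplits) and reading (createRecordReader) are separate methods.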