SparkOrcReader

java.lang.Object

org.apache.iceberg.spark.data.SparkOrcReader

All Implemented Interfaces: OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>

public class SparkOrcReader extends Object implements OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>

Converts the OrcIterator, which returns ORC's VectorizedRowBatch, to a set of Spark's UnsafeRows.

It minimizes allocations by reusing most of the objects in the implementation.
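The allocation-saving idea can be illustrated with a minimal, self-contained sketch. It does not use the Iceberg, ORC, or Spark classes: `ColumnBatch`, `MutableRow`, `ReusingRowReader`, and `ReuseDemo` are hypothetical stand-ins invented for this example. The sketch assumes a reader that fills one preallocated row object on every call. Which objects SparkOrcReader actually shares between calls is not stated on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a columnar batch: one long[] per column.
final class ColumnBatch {
  final long[][] columns;
  final int size;

  ColumnBatch(long[][] columns, int size) {
    this.columns = columns;
    this.size = size;
  }
}

// Hypothetical stand-in for a mutable row with a defensive copy() method.
final class MutableRow {
  final long[] values;

  MutableRow(int numFields) {
    this.values = new long[numFields];
  }

  MutableRow copy() {
    MutableRow result = new MutableRow(values.length);
    System.arraycopy(values, 0, result.values, 0, values.length);
    return result;
  }
}

// Reader that allocates its row once and overwrites it on every read.
final class ReusingRowReader {
  private final MutableRow reused;

  ReusingRowReader(int numFields) {
    this.reused = new MutableRow(numFields);
  }

  MutableRow read(ColumnBatch batch, int row) {
    for (int col = 0; col < reused.values.length; col++) {
      reused.values[col] = batch.columns[col][row];
    }
    return reused; // same instance on every call
  }
}

final class ReuseDemo {
  // Collects every row of the batch, optionally copying each one first.
  static List<MutableRow> collect(ColumnBatch batch, boolean copyRows) {
    ReusingRowReader reader = new ReusingRowReader(batch.columns.length);
    List<MutableRow> rows = new ArrayList<>();
    for (int row = 0; row < batch.size; row++) {
      MutableRow current = reader.read(batch, row);
      rows.add(copyRows ? current.copy() : current);
    }
    return rows;
  }

  public static void main(String[] args) {
    ColumnBatch batch = new ColumnBatch(new long[][] {{1L, 2L, 3L}, {10L, 20L, 30L}}, 3);
    // Without copying, every list entry is the same object holding the last row.
    System.out.println(Arrays.toString(collect(batch, false).get(0).values)); // [3, 30]
    // With copying, each entry keeps the values it had when it was read.
    System.out.println(Arrays.toString(collect(batch, true).get(0).values)); // [1, 10]
  }
}
```

The trade-off is the usual one for reused objects: fewer allocations per row, at the cost that a caller who retains rows past the next read should copy them first. Spark's InternalRow provides a `copy()` method for that purpose.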

Constructor Summary

| Constructor | Description |
| --- | --- |
| SparkOrcReader(Schema expectedSchema, org.apache.orc.TypeDescription readSchema) | |
| SparkOrcReader(Schema expectedSchema, org.apache.orc.TypeDescription readOrcSchema, Map<Integer,?> idToConstant) | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| org.apache.spark.sql.catalyst.InternalRow | read(org.apache.orc.storage.ql.exec.vector.VectorizedRowBatch batch, int row) | Reads a row. |
| void | setBatchContext(long batchOffsetInFile) | |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- SparkOrcReader
  
  public SparkOrcReader(Schema expectedSchema, org.apache.orc.TypeDescription readSchema)
- SparkOrcReader
  
  public SparkOrcReader(Schema expectedSchema, org.apache.orc.TypeDescription readOrcSchema, Map<Integer,?> idToConstant)

Method Details
- read
  
  public org.apache.spark.sql.catalyst.InternalRow read(org.apache.orc.storage.ql.exec.vector.VectorizedRowBatch batch, int row)
  
  Description copied from interface: OrcRowReader
  
  Reads a row.
  
  Specified by:
  
  read in interface OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>
- setBatchContext
  
  public void setBatchContext(long batchOffsetInFile)
  
  Specified by:
  
  setBatchContext in interface OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>
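
How the two methods might fit together can be sketched with stand-in types. Nothing below is Iceberg or ORC code: `SimpleBatch`, `SimpleRowReader`, `PositionTaggingReader`, and `BatchDriver` are hypothetical names that only mirror the method signatures listed above. Two things are assumptions of this sketch: that a driver calls `setBatchContext` once per batch before reading that batch's rows, and that `batchOffsetInFile` counts the rows preceding the batch. The latter is a reading of the parameter name, not something this page states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for ORC's VectorizedRowBatch: a single long column.
final class SimpleBatch {
  final long[] values;
  final int size;

  SimpleBatch(long... values) {
    this.values = values;
    this.size = values.length;
  }
}

// Hypothetical interface mirroring the shape of read and setBatchContext.
interface SimpleRowReader<T> {
  T read(SimpleBatch batch, int row);

  void setBatchContext(long batchOffsetInFile);
}

// Example reader that tags each value with its assumed position in the file.
final class PositionTaggingReader implements SimpleRowReader<String> {
  private long batchOffsetInFile;

  @Override
  public void setBatchContext(long batchOffsetInFile) {
    this.batchOffsetInFile = batchOffsetInFile;
  }

  @Override
  public String read(SimpleBatch batch, int row) {
    return (batchOffsetInFile + row) + ":" + batch.values[row];
  }
}

final class BatchDriver {
  // Sets the batch context once per batch, then reads that batch row by row.
  static <T> List<T> readAll(List<SimpleBatch> batches, SimpleRowReader<T> reader) {
    List<T> out = new ArrayList<>();
    long offset = 0L; // assumed meaning: rows seen before the current batch
    for (SimpleBatch batch : batches) {
      reader.setBatchContext(offset);
      for (int row = 0; row < batch.size; row++) {
        out.add(reader.read(batch, row));
      }
      offset += batch.size;
    }
    return out;
  }

  public static void main(String[] args) {
    List<SimpleBatch> batches = Arrays.asList(new SimpleBatch(7L, 8L), new SimpleBatch(9L));
    System.out.println(readAll(batches, new PositionTaggingReader())); // [0:7, 1:8, 2:9]
  }
}
```

Keeping the per-batch state in a separate setter leaves `read` with the same two arguments for every row, so the driver updates the context only when it moves to a new batch.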