Package org.apache.iceberg.spark.data
Class SparkOrcReader
- java.lang.Object
  - org.apache.iceberg.spark.data.SparkOrcReader
- All Implemented Interfaces:
OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>
public class SparkOrcReader extends java.lang.Object implements OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>
Converts the OrcIterator, which returns ORC's VectorizedRowBatch, to a set of Spark's UnsafeRows. It minimizes allocations by reusing most of the objects in the implementation.
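The allocation-reuse idea described above can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: ColumnBatch, ReusableRow and ReusingRowReader are simplified stand-ins invented for illustration, not ORC's VectorizedRowBatch, Spark's InternalRow, or the actual SparkOrcReader internals. It only shows the general pattern of filling one preallocated row object per call instead of allocating a new one.

```java
// A sketch of the "reuse one row object" pattern, under the assumptions stated above.
final class RowReuseSketch {

    // Hypothetical columnar batch: one array per column (NOT ORC's VectorizedRowBatch).
    static final class ColumnBatch {
        final long[] ids;
        final String[] names;

        ColumnBatch(long[] ids, String[] names) {
            this.ids = ids;
            this.names = names;
        }

        int size() {
            return ids.length;
        }
    }

    // Hypothetical mutable row (NOT Spark's InternalRow or UnsafeRow).
    static final class ReusableRow {
        long id;
        String name;
    }

    // Hypothetical reader: the row is allocated once and overwritten on every read.
    static final class ReusingRowReader {
        private final ReusableRow row = new ReusableRow();

        ReusableRow read(ColumnBatch batch, int rowIndex) {
            row.id = batch.ids[rowIndex];
            row.name = batch.names[rowIndex];
            return row; // same instance every time
        }
    }

    public static void main(String[] args) {
        ColumnBatch batch = new ColumnBatch(new long[] {1L, 2L}, new String[] {"a", "b"});
        ReusingRowReader reader = new ReusingRowReader();
        for (int i = 0; i < batch.size(); i++) {
            ReusableRow r = reader.read(batch, i);
            System.out.println(r.id + " " + r.name);
        }
    }
}
```

In this sketch the returned row is only valid until the next read call, so a caller that needs to retain a row would have to copy it first.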
Constructor Summary
Constructors
- SparkOrcReader(Schema expectedSchema, org.apache.orc.TypeDescription readSchema)
- SparkOrcReader(Schema expectedSchema, org.apache.orc.TypeDescription readOrcSchema, java.util.Map<java.lang.Integer,?> idToConstant)
-
Method Summary
All Methods (both are concrete instance methods)
- org.apache.spark.sql.catalyst.InternalRow read(org.apache.orc.storage.ql.exec.vector.VectorizedRowBatch batch, int row)
  Reads a row.
- void setBatchContext(long batchOffsetInFile)
Method Detail
-
read
public org.apache.spark.sql.catalyst.InternalRow read(org.apache.orc.storage.ql.exec.vector.VectorizedRowBatch batch, int row)
Description copied from interface: OrcRowReader
Reads a row.
- Specified by: read in interface OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>
-
setBatchContext
public void setBatchContext(long batchOffsetInFile)
- Specified by: setBatchContext in interface OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>
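As a rough illustration of how the two methods above might be driven together, the sketch below calls setBatchContext once per batch and read once per row. It rests on assumptions this page does not confirm: that batchOffsetInFile is the file-level position of the batch's first row (suggested only by the parameter name), and that a caller sets the context before reading rows from a new batch. The RowReader interface, PositionedReader and readAll are hypothetical names, and a plain long[] stands in for a batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// A sketch of a per-batch context plus per-row read loop, under the assumptions stated above.
final class BatchContextSketch {

    // Hypothetical stand-in mirroring the two method names listed on this page.
    interface RowReader<T> {
        T read(long[] batch, int row);

        void setBatchContext(long batchOffsetInFile);
    }

    // Hypothetical reader that tags each value with its assumed position in the file.
    static final class PositionedReader implements RowReader<String> {
        private long batchOffsetInFile = 0L;

        @Override
        public void setBatchContext(long batchOffsetInFile) {
            this.batchOffsetInFile = batchOffsetInFile;
        }

        @Override
        public String read(long[] batch, int row) {
            // Assumed meaning: file position = offset of the batch + index within the batch.
            return (batchOffsetInFile + row) + ":" + batch[row];
        }
    }

    // Drives a reader over consecutive batches, updating the context once per batch.
    static <T> List<T> readAll(List<long[]> batches, RowReader<T> reader) {
        List<T> out = new ArrayList<>();
        long offset = 0L;
        for (long[] batch : batches) {
            reader.setBatchContext(offset);
            for (int row = 0; row < batch.length; row++) {
                out.add(reader.read(batch, row));
            }
            offset += batch.length;
        }
        return out;
    }

    public static void main(String[] args) {
        List<long[]> batches = Arrays.asList(new long[] {10L, 11L}, new long[] {12L});
        System.out.println(readAll(batches, new PositionedReader())); // prints [0:10, 1:11, 2:12]
    }
}
```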