Package org.apache.iceberg.spark.data
Class SparkOrcReader
java.lang.Object
org.apache.iceberg.spark.data.SparkOrcReader
- All Implemented Interfaces:
OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>
public class SparkOrcReader
extends Object
implements OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>
Converts the output of OrcIterator, which returns ORC's VectorizedRowBatch, into a set of Spark's UnsafeRows.
It minimizes allocations by reusing most of the objects in the implementation.
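The object-reuse idea can be illustrated with a minimal, self-contained sketch. It does not use the Iceberg, Spark, or ORC classes: `ToyBatch`, `ToyRow`, `ReusingRowReader`, and `ReuseSketch` are hypothetical stand-ins, and the sketch only assumes that a reader may hand back the same mutable row object on every call. It says nothing about how SparkOrcReader is actually implemented.

```java
/** Hypothetical stand-in for a columnar batch: one long[] per column. */
final class ToyBatch {
  final long[][] cols;
  final int size;

  ToyBatch(long[][] cols, int size) {
    this.cols = cols;
    this.size = size;
  }
}

/** Hypothetical mutable row whose fields are overwritten on every read. */
final class ToyRow {
  final long[] values;

  ToyRow(int numFields) {
    this.values = new long[numFields];
  }
}

/** Row reader that allocates its output row once and reuses it for every call. */
final class ReusingRowReader {
  private final ToyRow reused;

  ReusingRowReader(int numFields) {
    this.reused = new ToyRow(numFields);
  }

  /** Copies row {@code row} of the batch into the shared row and returns it. */
  ToyRow read(ToyBatch batch, int row) {
    for (int c = 0; c < reused.values.length; c++) {
      reused.values[c] = batch.cols[c][row];
    }
    return reused;
  }
}

final class ReuseSketch {
  /** Sums every field of every row, consuming each row before the next read. */
  static long sumAll(ToyBatch batch) {
    ReusingRowReader reader = new ReusingRowReader(batch.cols.length);
    long total = 0;
    for (int r = 0; r < batch.size; r++) {
      ToyRow row = reader.read(batch, r);
      for (long v : row.values) {
        total += v;
      }
    }
    return total;
  }

  public static void main(String[] args) {
    ToyBatch batch = new ToyBatch(new long[][] {{1, 2, 3}, {10, 20, 30}}, 3);
    System.out.println(sumAll(batch));
  }
}
```

Under this pattern a caller that needs to keep a row beyond the next `read` call has to copy it first, because the returned object is overwritten in place.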
-
Constructor Summary
Constructor | Description
SparkOrcReader(Schema expectedSchema, org.apache.orc.TypeDescription readSchema)
SparkOrcReader(Schema expectedSchema, org.apache.orc.TypeDescription readOrcSchema, Map<Integer, ?> idToConstant)
-
Method Summary
Modifier and Type | Method | Description
org.apache.spark.sql.catalyst.InternalRow | read(org.apache.orc.storage.ql.exec.vector.VectorizedRowBatch batch, int row) | Reads a row.
void | setBatchContext(long batchOffsetInFile) |
-
Constructor Details
-
SparkOrcReader
-
SparkOrcReader
-
-
Method Details
-
read
public org.apache.spark.sql.catalyst.InternalRow read(org.apache.orc.storage.ql.exec.vector.VectorizedRowBatch batch, int row)
Description copied from interface: OrcRowReader
Reads a row.
- Specified by:
read in interface OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>
-
setBatchContext
public void setBatchContext(long batchOffsetInFile)
- Specified by:
setBatchContext in interface OrcRowReader<org.apache.spark.sql.catalyst.InternalRow>
-
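The two methods above suggest a calling sequence in which the batch context is set before the rows of a batch are read. The following sketch shows one plausible shape of that loop, using only hypothetical stand-in types (`MiniBatch`, `MiniRowReader`, `PositionTaggingReader`, `BatchContextSketch`) rather than the real Iceberg and ORC classes. Two things are assumptions, not documented behavior: that `batchOffsetInFile` is the position of the batch's first row within the file, and that the caller invokes `setBatchContext` once per batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a batch of values (not ORC's VectorizedRowBatch). */
final class MiniBatch {
  final String[] values;

  MiniBatch(String... values) {
    this.values = values;
  }
}

/** Hypothetical interface mirroring the two method shapes listed above. */
interface MiniRowReader<T> {
  T read(MiniBatch batch, int row);

  void setBatchContext(long batchOffsetInFile);
}

/** Toy reader that tags each value with its assumed row position in the file. */
final class PositionTaggingReader implements MiniRowReader<String> {
  private long batchOffset = 0L;

  @Override
  public void setBatchContext(long batchOffsetInFile) {
    // Assumption: the offset is the file-level position of the batch's first row.
    this.batchOffset = batchOffsetInFile;
  }

  @Override
  public String read(MiniBatch batch, int row) {
    return (batchOffset + row) + ":" + batch.values[row];
  }
}

final class BatchContextSketch {
  /** Sets the context once per batch, then reads every row of that batch. */
  static List<String> readAll(List<MiniBatch> batches, MiniRowReader<String> reader) {
    List<String> out = new ArrayList<>();
    long offset = 0L;
    for (MiniBatch batch : batches) {
      reader.setBatchContext(offset);
      for (int row = 0; row < batch.values.length; row++) {
        out.add(reader.read(batch, row));
      }
      offset += batch.values.length;
    }
    return out;
  }

  public static void main(String[] args) {
    List<MiniBatch> batches = Arrays.asList(new MiniBatch("a", "b"), new MiniBatch("c"));
    System.out.println(readAll(batches, new PositionTaggingReader()));
  }
}
```

Keeping the offset as reader state, rather than passing it to every `read` call, keeps the per-row call small. That is a design reading of the interface shape, not a statement about Iceberg's rationale.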