Class VectorizedArrowReader
- java.lang.Object
  - org.apache.iceberg.arrow.vectorized.VectorizedArrowReader
- All Implemented Interfaces: VectorizedReader<VectorHolder>
public class VectorizedArrowReader extends java.lang.Object implements VectorizedReader<VectorHolder>
VectorReader(s) that read a batch of values into Arrow vectors. This class also allocates the right kind of Arrow vector for the corresponding Iceberg/Parquet data type.
-
-
Field Summary
Fields (modifier and type, field):
- static int DEFAULT_BATCH_SIZE
-
Constructor Summary
Constructors:
- VectorizedArrowReader(org.apache.parquet.column.ColumnDescriptor desc, Types.NestedField icebergField, org.apache.arrow.memory.BufferAllocator ra, boolean setArrowValidityVector)
-
Method Summary
Methods (modifier and type, method, description):
- void close(): Release any resources allocated.
- static VectorizedArrowReader nulls()
- VectorHolder read(VectorHolder reuse, int numValsToRead): Reads a batch of type T (here VectorHolder) of size numValsToRead.
- void setBatchSize(int batchSize)
- void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore source, java.util.Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metadata)
- java.lang.String toString()
-
-
-
Field Detail
-
DEFAULT_BATCH_SIZE
public static final int DEFAULT_BATCH_SIZE
- See Also: Constant Field Values
-
-
Constructor Detail
-
VectorizedArrowReader
public VectorizedArrowReader(org.apache.parquet.column.ColumnDescriptor desc, Types.NestedField icebergField, org.apache.arrow.memory.BufferAllocator ra, boolean setArrowValidityVector)
-
-
Method Detail
-
setBatchSize
public void setBatchSize(int batchSize)
- Specified by: setBatchSize in interface VectorizedReader<VectorHolder>
-
read
public VectorHolder read(VectorHolder reuse, int numValsToRead)
Description copied from interface: VectorizedReader
Reads a batch of type T (here VectorHolder) of size numValsToRead.
- Specified by: read in interface VectorizedReader<VectorHolder>
- Parameters:
  reuse - container for the last batch, to be reused for the next batch
  numValsToRead - number of rows to read
- Returns: batch of records of type T (here VectorHolder)
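The reuse parameter suggests a loop in which the holder returned by one call is passed back into the next. The following is a minimal, self-contained sketch of that calling pattern only. It does not use the Iceberg, Arrow, or Parquet classes: Holder and SketchReader are hypothetical stand-ins that borrow the method names documented here, the data is a plain int array, and the batch size of 4 is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchReadSketch {

  /** Hypothetical stand-in for VectorHolder: a reusable buffer plus a count of valid values. */
  static final class Holder {
    final int[] values;
    int numValues;

    Holder(int capacity) {
      this.values = new int[capacity];
    }
  }

  /** Hypothetical stand-in that mirrors the documented method names; not the Iceberg class. */
  static final class SketchReader implements AutoCloseable {
    private final int[] column;   // pretend column data
    private int position = 0;     // next value to hand out
    private int batchSize = 1;    // placeholder until setBatchSize is called

    SketchReader(int[] column) {
      this.column = column;
    }

    void setBatchSize(int batchSize) {
      if (batchSize <= 0) {
        throw new IllegalArgumentException("batchSize must be positive: " + batchSize);
      }
      this.batchSize = batchSize;
    }

    /** Fills the reused holder when it is large enough, otherwise allocates a new one. */
    Holder read(Holder reuse, int numValsToRead) {
      Holder holder = (reuse != null && reuse.values.length >= numValsToRead)
          ? reuse
          : new Holder(Math.max(batchSize, numValsToRead));
      int n = Math.min(numValsToRead, column.length - position);
      System.arraycopy(column, position, holder.values, 0, n);
      position += n;
      holder.numValues = n;
      return holder;
    }

    @Override
    public void close() {
      // Nothing to release in this sketch; a real reader would free its buffers here.
    }
  }

  /** Reads totalRows values in batches and returns the size of each batch. */
  static List<Integer> readAll(int totalRows, int batchSize) {
    int[] column = new int[totalRows];
    for (int i = 0; i < totalRows; i++) {
      column[i] = i;
    }
    List<Integer> sizes = new ArrayList<>();
    try (SketchReader reader = new SketchReader(column)) {
      reader.setBatchSize(batchSize);
      Holder holder = null;               // first call has nothing to reuse
      int remaining = totalRows;
      while (remaining > 0) {
        int toRead = Math.min(batchSize, remaining);
        holder = reader.read(holder, toRead);   // pass the previous holder back in
        sizes.add(holder.numValues);
        remaining -= holder.numValues;
      }
    }
    return sizes;
  }

  public static void main(String[] args) {
    System.out.println(readAll(10, 4)); // prints [4, 4, 2]
  }
}
```

In this sketch the caller tracks how many rows remain and asks for a shorter final batch. Whether the real reader expects the caller to do that bookkeeping is an assumption, not something this page states.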
-
setRowGroupInfo
public void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore source, java.util.Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metadata)
- Specified by: setRowGroupInfo in interface VectorizedReader<VectorHolder>
- Parameters:
  source - row group information for all the columns
  metadata - map of ColumnPath -> ColumnChunkMetaData for the row group
-
close
public void close()
Description copied from interface: VectorizedReader
Release any resources allocated.
- Specified by: close in interface VectorizedReader<VectorHolder>
-
toString
public java.lang.String toString()
- Overrides: toString in class java.lang.Object
-
nulls
public static VectorizedArrowReader nulls()
-
-