Class VectorizedArrowReader
- java.lang.Object
-
- org.apache.iceberg.arrow.vectorized.VectorizedArrowReader
-
- All Implemented Interfaces:
VectorizedReader<VectorHolder>
- Direct Known Subclasses:
VectorizedArrowReader.ConstantVectorReader, VectorizedArrowReader.DeletedVectorReader
public class VectorizedArrowReader extends java.lang.Object implements VectorizedReader<VectorHolder>
VectorReader(s) that read a batch of values into Arrow vectors. It also takes care of allocating the right kind of Arrow vector for the corresponding Iceberg/Parquet data type.
-
-
Nested Class Summary
Nested Classes
static class VectorizedArrowReader.ConstantVectorReader<T>
    A dummy vector reader that doesn't actually read files. Instead, it returns a dummy VectorHolder indicating the constant value that should be used for this column.
static class VectorizedArrowReader.DeletedVectorReader
    A dummy vector reader that doesn't actually read files.
-
Field Summary
Fields
static int DEFAULT_BATCH_SIZE
-
Constructor Summary
Constructors
VectorizedArrowReader(org.apache.parquet.column.ColumnDescriptor desc, Types.NestedField icebergField, org.apache.arrow.memory.BufferAllocator ra, boolean setArrowValidityVector)
-
Method Summary
Methods
void close()
    Release any resources allocated.
protected Types.NestedField icebergField()
static VectorizedArrowReader nulls()
static VectorizedArrowReader positions()
static VectorizedArrowReader positionsWithSetArrowValidityVector()
VectorHolder read(VectorHolder reuse, int numValsToRead)
    Reads a batch of type T and of size numValsToRead.
void setBatchSize(int batchSize)
void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore source, java.util.Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metadata, long rowPosition)
    Sets the row group information to be used with this reader.
java.lang.String toString()
-
-
-
Field Detail
-
DEFAULT_BATCH_SIZE
public static final int DEFAULT_BATCH_SIZE
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
VectorizedArrowReader
public VectorizedArrowReader(org.apache.parquet.column.ColumnDescriptor desc, Types.NestedField icebergField, org.apache.arrow.memory.BufferAllocator ra, boolean setArrowValidityVector)
-
-
Method Detail
-
icebergField
protected Types.NestedField icebergField()
-
setBatchSize
public void setBatchSize(int batchSize)
- Specified by:
setBatchSize in interface VectorizedReader<VectorHolder>
-
read
public VectorHolder read(VectorHolder reuse, int numValsToRead)
Description copied from interface: VectorizedReader
Reads a batch of type T and of size numValsToRead.
- Specified by:
read in interface VectorizedReader<VectorHolder>
- Parameters:
reuse - container for the last batch, to be reused for the next batch
numValsToRead - number of rows to read
- Returns:
- batch of records of type T
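The reuse contract of read can be illustrated with a small, self-contained sketch. It does not use the Iceberg, Arrow, or Parquet classes: BatchReader, IntBatch, CountingReader, and BatchReadSketch are hypothetical stand-ins that only mirror the shape of setBatchSize, read(reuse, numValsToRead), and close as listed on this page. The assumption shown here is that a caller passes the holder returned by the previous call back in as reuse, so the reader can refill it instead of allocating a new one.

```java
import java.util.ArrayList;
import java.util.List;

// Driver: reads totalRows values in batches, handing the previous holder back for reuse.
final class BatchReadSketch {
  static List<Integer> readAll(BatchReader<IntBatch> reader, int totalRows, int batchSize) {
    List<Integer> out = new ArrayList<>();
    reader.setBatchSize(batchSize);
    IntBatch holder = null; // nothing to reuse on the first call
    int remaining = totalRows;
    while (remaining > 0) {
      int n = Math.min(batchSize, remaining);
      holder = reader.read(holder, n); // pass the last batch back in as "reuse"
      for (int i = 0; i < holder.size; i++) {
        out.add(holder.values[i]); // copy values out before the holder is overwritten
      }
      remaining -= n;
    }
    return out;
  }

  public static void main(String[] args) {
    try (CountingReader reader = new CountingReader()) {
      System.out.println(readAll(reader, 10, 4));
    }
  }
}

// Hypothetical interface mirroring the method shapes listed on this page (not the Iceberg type).
interface BatchReader<T> extends AutoCloseable {
  T read(T reuse, int numValsToRead);

  void setBatchSize(int batchSize);

  @Override
  void close();
}

// Hypothetical holder standing in for VectorHolder: a fixed buffer plus a fill count.
final class IntBatch {
  final int[] values;
  int size;

  IntBatch(int capacity) {
    this.values = new int[capacity];
  }
}

// Hypothetical reader that produces consecutive integers instead of reading Parquet pages.
final class CountingReader implements BatchReader<IntBatch> {
  private int batchSize = 4; // assumed default for this sketch only
  private int next = 0;
  private boolean closed = false;

  @Override
  public void setBatchSize(int batchSize) {
    this.batchSize = batchSize;
  }

  @Override
  public IntBatch read(IntBatch reuse, int numValsToRead) {
    if (closed) {
      throw new IllegalStateException("reader is closed");
    }
    // Refill the caller's holder when it is large enough; otherwise allocate a new one.
    IntBatch batch = (reuse != null && reuse.values.length >= numValsToRead)
        ? reuse
        : new IntBatch(Math.max(batchSize, numValsToRead));
    for (int i = 0; i < numValsToRead; i++) {
      batch.values[i] = next++;
    }
    batch.size = numValsToRead;
    return batch;
  }

  @Override
  public void close() {
    closed = true;
  }
}
```

In this sketch the last batch is smaller than the batch size (10 rows read as 4, 4, and 2), which is why the caller relies on the holder's fill count rather than its capacity. Whether the real reader refills the same VectorHolder or returns a new one is not stated on this page, so callers should always use the returned value.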
-
setRowGroupInfo
public void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore source, java.util.Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metadata, long rowPosition)
Description copied from interface: VectorizedReader
Sets the row group information to be used with this reader.
- Specified by:
setRowGroupInfo in interface VectorizedReader<VectorHolder>
- Parameters:
source - row group information for all the columns
metadata - map of ColumnPath -> ColumnChunkMetaData for the row group
rowPosition - the row group's row offset in the Parquet file
-
close
public void close()
Description copied from interface: VectorizedReader
Release any resources allocated.
- Specified by:
close in interface VectorizedReader<VectorHolder>
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
nulls
public static VectorizedArrowReader nulls()
-
positions
public static VectorizedArrowReader positions()
-
positionsWithSetArrowValidityVector
public static VectorizedArrowReader positionsWithSetArrowValidityVector()
-
-