Class VectorizedArrowReader
- java.lang.Object
  - org.apache.iceberg.arrow.vectorized.VectorizedArrowReader

All Implemented Interfaces:
- VectorizedReader<VectorHolder>

Direct Known Subclasses:
- VectorizedArrowReader.ConstantVectorReader
 
public class VectorizedArrowReader extends java.lang.Object implements VectorizedReader<VectorHolder>

VectorReader(s) that read a batch of values into Arrow vectors. The reader also allocates the appropriate kind of Arrow vector for the corresponding Iceberg/Parquet data types.
Nested Class Summary
- static class VectorizedArrowReader.ConstantVectorReader<T>: A dummy vector reader that doesn't actually read files. Instead, it returns a dummy VectorHolder that indicates the constant value to use for this column.
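The idea behind a constant reader can be illustrated with a small, self-contained sketch. It does not use the Iceberg classes: `ConstantHolderSketch` and `ConstantReaderSketch` are hypothetical stand-ins, and the real ConstantVectorReader and VectorHolder may represent the constant differently. The sketch only shows the general pattern of describing a batch by one value instead of reading it from a file.

```java
/** Hypothetical holder that records one constant instead of materialized values. Not Iceberg's VectorHolder. */
final class ConstantHolderSketch<T> {
  final T constant;
  final int numRows;

  ConstantHolderSketch(T constant, int numRows) {
    this.constant = constant;
    this.numRows = numRows;
  }

  /** Every valid row position yields the same constant. */
  T valueAt(int row) {
    if (row < 0 || row >= numRows) {
      throw new IndexOutOfBoundsException("row " + row);
    }
    return constant;
  }
}

/** Hypothetical reader that never touches a file; each batch is described by the constant alone. */
final class ConstantReaderSketch<T> {
  private final T constant;

  ConstantReaderSketch(T constant) {
    this.constant = constant;
  }

  ConstantHolderSketch<T> read(ConstantHolderSketch<T> reuse, int numValsToRead) {
    // The reuse argument is ignored in this sketch: there are no buffers to recycle.
    return new ConstantHolderSketch<>(constant, numValsToRead);
  }
}

final class ConstantReaderDemo {
  public static void main(String[] args) {
    ConstantReaderSketch<String> reader = new ConstantReaderSketch<>("2024-01-01");
    ConstantHolderSketch<String> batch = reader.read(null, 3);
    for (int row = 0; row < batch.numRows; row++) {
      System.out.println(batch.valueAt(row));
    }
  }
}
```

A design like this keeps memory use independent of the batch size, since no per-row storage is needed for a column whose value never changes.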
Field Summary
- static int DEFAULT_BATCH_SIZE
Constructor Summary
- VectorizedArrowReader(org.apache.parquet.column.ColumnDescriptor desc, Types.NestedField icebergField, org.apache.arrow.memory.BufferAllocator ra, boolean setArrowValidityVector)
Method Summary
- void close(): Release any resources allocated.
- static VectorizedArrowReader nulls()
- static VectorizedArrowReader positions()
- VectorHolder read(VectorHolder reuse, int numValsToRead): Reads a batch of type T (here, VectorHolder) and of size numValsToRead.
- void setBatchSize(int batchSize)
- void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore source, java.util.Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metadata, long rowPosition): Sets the row group information to be used with this reader.
- java.lang.String toString()
 
Field Detail

DEFAULT_BATCH_SIZE
public static final int DEFAULT_BATCH_SIZE
See Also:
- Constant Field Values
 
 
Constructor Detail

VectorizedArrowReader
public VectorizedArrowReader(org.apache.parquet.column.ColumnDescriptor desc, Types.NestedField icebergField, org.apache.arrow.memory.BufferAllocator ra, boolean setArrowValidityVector)
 
Method Detail

setBatchSize
public void setBatchSize(int batchSize)
Specified by:
- setBatchSize in interface VectorizedReader<VectorHolder>
 
read
public VectorHolder read(VectorHolder reuse, int numValsToRead)
Description copied from interface: VectorizedReader
Reads a batch of type T (here, VectorHolder) and of size numValsToRead.
Specified by:
- read in interface VectorizedReader<VectorHolder>
Parameters:
- reuse: container for the last batch, to be reused for the next batch
- numValsToRead: number of rows to read
Returns:
- batch of records of type VectorHolder
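
The reuse parameter suggests a calling pattern in which each batch is handed back to the reader on the next call. The following is a minimal, self-contained sketch of that pattern. It does not use the Iceberg classes: `SketchReader`, `IntBatch`, and `CountingReader` are hypothetical stand-ins whose method shapes mirror setBatchSize, read, and close as listed on this page, and the real reader's allocation and reuse behavior may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in mirroring the setBatchSize/read/close methods; not an Iceberg type. */
interface SketchReader<T> extends AutoCloseable {
  void setBatchSize(int batchSize);

  T read(T reuse, int numValsToRead);

  @Override
  void close();
}

/** Hypothetical batch container playing the role that VectorHolder plays for the real reader. */
final class IntBatch {
  final int[] values;
  int size;

  IntBatch(int capacity) {
    this.values = new int[capacity];
  }
}

/** Toy reader that emits 0, 1, 2, ... and recycles the caller's container when it is large enough. */
final class CountingReader implements SketchReader<IntBatch> {
  private int batchSize = 4;
  private int next = 0;
  boolean closed = false;

  @Override
  public void setBatchSize(int batchSize) {
    this.batchSize = batchSize;
  }

  @Override
  public IntBatch read(IntBatch reuse, int numValsToRead) {
    boolean reusable = reuse != null && reuse.values.length >= numValsToRead;
    IntBatch out = reusable ? reuse : new IntBatch(Math.max(batchSize, numValsToRead));
    for (int i = 0; i < numValsToRead; i++) {
      out.values[i] = next++;
    }
    out.size = numValsToRead;
    return out;
  }

  @Override
  public void close() {
    closed = true;
  }
}

final class BatchReadSketch {
  /** Reads totalRows values in batches, passing each batch back as the reuse argument. */
  static List<Integer> readAll(SketchReader<IntBatch> reader, int totalRows, int batchSize) {
    List<Integer> result = new ArrayList<>();
    reader.setBatchSize(batchSize);
    IntBatch batch = null;
    int remaining = totalRows;
    while (remaining > 0) {
      int n = Math.min(batchSize, remaining);
      batch = reader.read(batch, n); // hand the previous container back to the reader
      for (int i = 0; i < batch.size; i++) {
        result.add(batch.values[i]);
      }
      remaining -= n;
    }
    return result;
  }

  public static void main(String[] args) {
    try (CountingReader reader = new CountingReader()) {
      System.out.println(readAll(reader, 10, 4));
    }
  }
}
```

Because the container may be overwritten by the next call, the sketch copies values out of each batch before reading again. A caller that keeps references into a reused batch would need the same care.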
 
setRowGroupInfo
public void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore source, java.util.Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metadata, long rowPosition)
Description copied from interface: VectorizedReader
Sets the row group information to be used with this reader.
Specified by:
- setRowGroupInfo in interface VectorizedReader<VectorHolder>
Parameters:
- source: row group information for all the columns
- metadata: map of ColumnPath -> ColumnChunkMetaData for the row group
- rowPosition: the row group's row offset in the Parquet file
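
To make the rowPosition parameter concrete, the sketch below computes the row offset of each row group from per-group row counts, taking a group's offset to be the number of rows in all preceding groups. `RowOffsetSketch` and its input array are hypothetical; in practice the row counts would come from the Parquet file's metadata, which is not shown here.

```java
import java.util.Arrays;

final class RowOffsetSketch {
  /** Returns, for each row group, the number of rows that precede it in the file. */
  static long[] rowGroupOffsets(long[] rowCounts) {
    long[] offsets = new long[rowCounts.length];
    long running = 0L;
    for (int i = 0; i < rowCounts.length; i++) {
      offsets[i] = running; // offset of group i is the total row count of groups 0..i-1
      running += rowCounts[i];
    }
    return offsets;
  }

  public static void main(String[] args) {
    // Hypothetical row counts for three row groups.
    long[] rowCounts = {100L, 250L, 50L};
    System.out.println(Arrays.toString(rowGroupOffsets(rowCounts)));
  }
}
```

Each offset would be the value passed as rowPosition when the reader is pointed at the corresponding row group.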
 
close
public void close()
Description copied from interface: VectorizedReader
Release any resources allocated.
Specified by:
- close in interface VectorizedReader<VectorHolder>
 
toString
public java.lang.String toString()
Overrides:
- toString in class java.lang.Object
 
nulls
public static VectorizedArrowReader nulls()

positions
public static VectorizedArrowReader positions()
 