Class VectorizedArrowReader
java.lang.Object
org.apache.iceberg.arrow.vectorized.VectorizedArrowReader
- All Implemented Interfaces:
- VectorizedReader<VectorHolder>
- Direct Known Subclasses:
- VectorizedArrowReader.ConstantVectorReader, VectorizedArrowReader.DeletedVectorReader
Vector reader that reads a batch of values into Arrow vectors. It
 also allocates the appropriate kind of Arrow vector for the corresponding
 Iceberg/Parquet data type.
Nested Class Summary
- static class VectorizedArrowReader.ConstantVectorReader: A dummy vector reader that doesn't actually read files; instead, it returns a dummy VectorHolder indicating the constant value that should be used for this column.
- static class VectorizedArrowReader.DeletedVectorReader: A dummy vector reader that doesn't actually read files.
Field Summary
- static final int DEFAULT_BATCH_SIZE
Constructor Summary
- VectorizedArrowReader(org.apache.parquet.column.ColumnDescriptor desc, Types.NestedField icebergField, org.apache.arrow.memory.BufferAllocator ra, boolean setArrowValidityVector)
Method Summary
- void close(): Release any resources allocated.
- protected Types.NestedField icebergField()
- static VectorizedArrowReader nulls()
- static VectorizedArrowReader positions()
- static VectorizedArrowReader positionsWithSetArrowValidityVector()
- VectorHolder read(VectorHolder reuse, int numValsToRead): Reads a batch of type T (here VectorHolder) containing numValsToRead values.
- void setBatchSize(int batchSize)
- void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore source, Map<org.apache.parquet.hadoop.metadata.ColumnPath, org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metadata): Sets the row group information to be used with this reader.
- String toString()
Field Details
DEFAULT_BATCH_SIZE
public static final int DEFAULT_BATCH_SIZE

Constructor Details
VectorizedArrowReader
public VectorizedArrowReader(org.apache.parquet.column.ColumnDescriptor desc, Types.NestedField icebergField, org.apache.arrow.memory.BufferAllocator ra, boolean setArrowValidityVector)

Method Details
icebergField

setBatchSize
public void setBatchSize(int batchSize)
- Specified by:
- setBatchSize in interface VectorizedReader<VectorHolder>

read
public VectorHolder read(VectorHolder reuse, int numValsToRead)
Description copied from interface: VectorizedReader
Reads a batch of type T (here VectorHolder) containing numValsToRead values.
- Specified by:
- read in interface VectorizedReader<VectorHolder>
- Parameters:
- reuse - container for the last batch, to be reused for the next batch
- numValsToRead - number of rows to read
- Returns:
- batch of records of type T

setRowGroupInfo
public void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore source, Map<org.apache.parquet.hadoop.metadata.ColumnPath, org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metadata)
Description copied from interface: VectorizedReader
Sets the row group information to be used with this reader.
- Specified by:
- setRowGroupInfo in interface VectorizedReader<VectorHolder>
- Parameters:
- source - row group information for all the columns
- metadata - map of ColumnPath -> ColumnChunkMetaData for the row group

close
public void close()
Description copied from interface: VectorizedReader
Release any resources allocated.
- Specified by:
- close in interface VectorizedReader<VectorHolder>
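
The read, setBatchSize, and close entries above suggest a call pattern: set a batch size, call read repeatedly while handing the previous batch back as reuse, then close. The sketch below illustrates that pattern only. It does not use Iceberg, Arrow, or Parquet: BatchReader, IntBatch, ArrayBatchReader, and ReadLoopSketch are hypothetical stand-ins invented for this example, setRowGroupInfo is omitted, and the reuse policy shown (reuse the container when it is large enough) is an assumption rather than the library's documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the reader contract described above.
// This is NOT the Iceberg VectorizedReader interface.
interface BatchReader<T> extends AutoCloseable {
  void setBatchSize(int batchSize);

  T read(T reuse, int numValsToRead);

  @Override
  void close();
}

// Hypothetical holder playing the role of VectorHolder: a reusable buffer plus a count.
final class IntBatch {
  final int[] values;
  int size;

  IntBatch(int capacity) {
    this.values = new int[capacity];
  }
}

// Toy reader that serves values from an in-memory array instead of Parquet pages.
final class ArrayBatchReader implements BatchReader<IntBatch> {
  private final int[] source;
  private int position = 0;
  private int batchSize = 4;
  private boolean closed = false;

  ArrayBatchReader(int[] source) {
    this.source = source;
  }

  @Override
  public void setBatchSize(int batchSize) {
    this.batchSize = batchSize;
  }

  @Override
  public IntBatch read(IntBatch reuse, int numValsToRead) {
    if (closed) {
      throw new IllegalStateException("reader is closed");
    }
    // Assumed policy: reuse the previous container when it is large enough.
    IntBatch batch = (reuse != null && reuse.values.length >= numValsToRead)
        ? reuse
        : new IntBatch(Math.max(batchSize, numValsToRead));
    int n = Math.min(numValsToRead, source.length - position);
    System.arraycopy(source, position, batch.values, 0, n);
    position += n;
    batch.size = n;
    return batch;
  }

  @Override
  public void close() {
    closed = true;
  }

  boolean isClosed() {
    return closed;
  }
}

final class ReadLoopSketch {
  // Drains the reader batch by batch, passing each batch back in as `reuse`.
  static List<Integer> readAll(BatchReader<IntBatch> reader, int total, int batchSize) {
    List<Integer> out = new ArrayList<>();
    reader.setBatchSize(batchSize);
    IntBatch batch = null;
    int remaining = total;
    while (remaining > 0) {
      int toRead = Math.min(batchSize, remaining);
      batch = reader.read(batch, toRead);
      for (int i = 0; i < batch.size; i++) {
        out.add(batch.values[i]);
      }
      remaining -= toRead;
    }
    return out;
  }
}
```

Because the stand-in close() declares no checked exception, a caller could wrap the reader in try-with-resources so that close() runs even if a read fails. The values are copied out of each batch before the next read because the same container may be overwritten on the following call.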
 
toString

nulls

positions

positionsWithSetArrowValidityVector