Class ColumnarBatchReader
java.lang.Object
org.apache.iceberg.arrow.vectorized.BaseBatchReader<org.apache.spark.sql.vectorized.ColumnarBatch>
org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader
- All Implemented Interfaces:
- VectorizedReader<org.apache.spark.sql.vectorized.ColumnarBatch>
public class ColumnarBatchReader
extends BaseBatchReader<org.apache.spark.sql.vectorized.ColumnarBatch>
VectorizedReader that returns Spark's ColumnarBatch to support Spark's vectorized
 read path. The ColumnarBatch returned is created by passing in the Arrow vectors
 populated via delegated read calls to VectorReader(s).
Field Summary
Fields inherited from class org.apache.iceberg.arrow.vectorized.BaseBatchReader
readers, vectorHolders
Constructor Summary
Method Summary
Modifier and Type / Method / Description
- final org.apache.spark.sql.vectorized.ColumnarBatch read(org.apache.spark.sql.vectorized.ColumnarBatch reuse, int numRowsToRead)
  Reads a batch of type T and of size numRowsToRead
- void setDeleteFilter(DeleteFilter<org.apache.spark.sql.catalyst.InternalRow> deleteFilter)
- void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore pageStore, Map<org.apache.parquet.hadoop.metadata.ColumnPath, org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metaData, long rowPosition)
  Sets the row group information to be used with this reader
Methods inherited from class org.apache.iceberg.arrow.vectorized.BaseBatchReader
close, closeVectors, setBatchSize
Constructor Details
ColumnarBatchReader
Method Details
setRowGroupInfo
public void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore pageStore, Map<org.apache.parquet.hadoop.metadata.ColumnPath, org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metaData, long rowPosition)
Description copied from interface: VectorizedReader
Sets the row group information to be used with this reader
- Specified by:
- setRowGroupInfo in interface VectorizedReader<org.apache.spark.sql.vectorized.ColumnarBatch>
- Overrides:
- setRowGroupInfo in class BaseBatchReader<org.apache.spark.sql.vectorized.ColumnarBatch>
- Parameters:
- pageStore - row group information for all the columns
- metaData - map of ColumnPath -> ColumnChunkMetaData for the row group
- rowPosition - the row group's row offset in the parquet file
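The rowPosition parameter can be illustrated with a small, self-contained sketch. It assumes that "row offset" means the zero-based index of the row group's first row, counted from the start of the file, so each offset is the sum of the row counts of the preceding row groups. RowGroupOffsets and startPositions are hypothetical names used only for this illustration; they are not part of Iceberg or Parquet.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not an Iceberg or Parquet class.
final class RowGroupOffsets {
  private RowGroupOffsets() {}

  // Returns, for each row group, the file-level index of its first row,
  // assuming offsets are zero-based and row groups are stored back to back.
  static long[] startPositions(long[] rowCounts) {
    long[] positions = new long[rowCounts.length];
    long next = 0L;
    for (int i = 0; i < rowCounts.length; i++) {
      positions[i] = next;   // first row of group i
      next += rowCounts[i];  // skip past the rows of group i
    }
    return positions;
  }

  public static void main(String[] args) {
    long[] rowCounts = {1000L, 1000L, 250L};
    System.out.println(Arrays.toString(startPositions(rowCounts)));
    // prints [0, 1000, 2000]
  }
}
```

Under that assumption, the value at index i is what a caller would pass as rowPosition when handing row group i to the reader.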
 
setDeleteFilter
read
public final org.apache.spark.sql.vectorized.ColumnarBatch read(org.apache.spark.sql.vectorized.ColumnarBatch reuse, int numRowsToRead)
Description copied from interface: VectorizedReader
Reads a batch of type T and of size numRowsToRead
- Parameters:
- reuse - container for the last batch to be reused for next batch
- numRowsToRead - number of rows to read
- Returns:
- batch of records of type T
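The reuse parameter suggests a read loop in which the previously returned batch is handed back on each call. The sketch below models only that calling pattern with hypothetical stand-in types (SketchBatch, SketchBatchReader, ReadLoopSketch); it does not use or reproduce the Iceberg or Spark classes. Whether and how the real reader recycles the container is not specified here, so the recycling rule in the sketch is an assumption.

```java
// Hypothetical stand-ins that model the call pattern only;
// these are NOT the Iceberg or Spark classes.
final class SketchBatch {
  final int[] values;  // single int column, for simplicity
  int numRows;         // number of valid rows currently held

  SketchBatch(int capacity) {
    values = new int[capacity];
  }
}

final class SketchBatchReader {
  private final int[] source;
  private int position = 0;

  SketchBatchReader(int[] source) {
    this.source = source;
  }

  // Same shape as read(reuse, numRowsToRead): refill the previous
  // container when it is large enough (assumed rule), else allocate.
  SketchBatch read(SketchBatch reuse, int numRowsToRead) {
    SketchBatch batch =
        (reuse != null && reuse.values.length >= numRowsToRead)
            ? reuse
            : new SketchBatch(numRowsToRead);
    System.arraycopy(source, position, batch.values, 0, numRowsToRead);
    batch.numRows = numRowsToRead;
    position += numRowsToRead;
    return batch;
  }
}

final class ReadLoopSketch {
  private ReadLoopSketch() {}

  // Reads all rows in batches of at most batchSize and sums them.
  static long sumAll(int[] data, int batchSize) {
    SketchBatchReader reader = new SketchBatchReader(data);
    SketchBatch batch = null;  // nothing to reuse on the first call
    long sum = 0L;
    int remaining = data.length;
    while (remaining > 0) {
      int n = Math.min(batchSize, remaining);
      batch = reader.read(batch, n);  // hand the last batch back in
      for (int i = 0; i < batch.numRows; i++) {
        sum += batch.values[i];
      }
      remaining -= n;
    }
    return sum;
  }
}
```

In this model the caller must finish consuming a batch before requesting the next one, because the same container may be overwritten by the following call.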