Class ColumnarBatchReader
java.lang.Object
  org.apache.iceberg.arrow.vectorized.BaseBatchReader<org.apache.spark.sql.vectorized.ColumnarBatch>
    org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader
All Implemented Interfaces:
VectorizedReader<org.apache.spark.sql.vectorized.ColumnarBatch>
public class ColumnarBatchReader extends BaseBatchReader<org.apache.spark.sql.vectorized.ColumnarBatch>
VectorizedReader that returns Spark's ColumnarBatch to support Spark's vectorized read path. The ColumnarBatch returned is created by passing in the Arrow vectors populated via delegated read calls to VectorReader(s).
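The delegation described above can be pictured with a minimal, dependency-free sketch. Every type in it (BatchReaderSketch, ColumnReader, Batch) is a hypothetical stand-in, not an Iceberg, Arrow, or Spark class, and plain int[] arrays stand in for Arrow vectors. It only illustrates the shape of the pattern: one reader per column fills a vector, and the batch reader wraps the filled vectors in a single batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins only: none of these types are Iceberg, Arrow, or Spark classes.
final class BatchReaderSketch {

  /** Stand-in for a per-column vector reader; returns one filled vector per call. */
  interface ColumnReader {
    int[] read(int numRowsToRead);
  }

  /** Stand-in for a columnar batch: one vector per column plus a row count. */
  static final class Batch {
    final List<int[]> columns;
    final int numRows;

    Batch(List<int[]> columns, int numRows) {
      this.columns = columns;
      this.numRows = numRows;
    }
  }

  private final List<ColumnReader> readers;

  BatchReaderSketch(List<ColumnReader> readers) {
    this.readers = readers;
  }

  /** Delegates to each column reader, then wraps the populated vectors in a batch. */
  Batch read(int numRowsToRead) {
    List<int[]> columns = new ArrayList<>();
    for (ColumnReader reader : readers) {
      columns.add(reader.read(numRowsToRead));
    }
    return new Batch(columns, numRowsToRead);
  }

  /** Builds a toy column reader that emits start, start + 1, ... on each call. */
  static ColumnReader counting(int start) {
    return n -> {
      int[] values = new int[n];
      for (int i = 0; i < n; i++) {
        values[i] = start + i;
      }
      return values;
    };
  }

  public static void main(String[] args) {
    BatchReaderSketch reader =
        new BatchReaderSketch(Arrays.asList(counting(0), counting(100)));
    Batch batch = reader.read(3);
    // prints "2 columns, 3 rows"
    System.out.println(batch.columns.size() + " columns, " + batch.numRows + " rows");
  }
}
```

The batch reader holds no decoding logic of its own in this sketch; adding a column means adding one more reader to the list passed to the constructor.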
Field Summary
Fields inherited from class org.apache.iceberg.arrow.vectorized.BaseBatchReader
readers, vectorHolders
Constructor Summary
Constructors
Constructor: ColumnarBatchReader(java.util.List<VectorizedReader<?>> readers)
Method Summary
Modifier and Type: org.apache.spark.sql.vectorized.ColumnarBatch
Method: read(org.apache.spark.sql.vectorized.ColumnarBatch reuse, int numRowsToRead)
Description: Reads a batch of type T (here, ColumnarBatch) containing numRowsToRead rows

Modifier and Type: void
Method: setDeleteFilter(DeleteFilter<org.apache.spark.sql.catalyst.InternalRow> deleteFilter)

Modifier and Type: void
Method: setRowGroupInfo(org.apache.parquet.column.page.PageReadStore pageStore, java.util.Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metaData, long rowPosition)
Description: Sets the row group information to be used with this reader
Methods inherited from class org.apache.iceberg.arrow.vectorized.BaseBatchReader
close, closeVectors, setBatchSize
Constructor Detail
ColumnarBatchReader
public ColumnarBatchReader(java.util.List<VectorizedReader<?>> readers)
Method Detail
setRowGroupInfo
public void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore pageStore, java.util.Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metaData, long rowPosition)
Description copied from interface: VectorizedReader
Sets the row group information to be used with this reader
Specified by:
setRowGroupInfo in interface VectorizedReader<org.apache.spark.sql.vectorized.ColumnarBatch>
Overrides:
setRowGroupInfo in class BaseBatchReader<org.apache.spark.sql.vectorized.ColumnarBatch>
Parameters:
pageStore - row group information for all the columns
metaData - map of ColumnPath -> ColumnChunkMetaData for the row group
rowPosition - the row group's row offset in the Parquet file
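To make the rowPosition parameter concrete: if it is read as the number of rows that precede the row group in the file, the offsets for consecutive row groups are a running sum of the earlier groups' row counts. The sketch below works that arithmetic through. RowPositionSketch and its method are hypothetical helpers, not part of Iceberg or Parquet, and the row counts are made-up sample values.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not an Iceberg or Parquet class.
final class RowPositionSketch {

  /** Returns the starting row offset of each row group, given per-group row counts. */
  static long[] rowPositions(long[] rowCounts) {
    long[] positions = new long[rowCounts.length];
    long offset = 0;
    for (int i = 0; i < rowCounts.length; i++) {
      positions[i] = offset;   // rows that come before group i
      offset += rowCounts[i];  // advance past group i
    }
    return positions;
  }

  public static void main(String[] args) {
    // Three row groups with 100, 250, and 50 rows (sample values).
    long[] positions = rowPositions(new long[] {100, 250, 50});
    System.out.println(Arrays.toString(positions)); // prints [0, 100, 350]
  }
}
```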
setDeleteFilter
public void setDeleteFilter(DeleteFilter<org.apache.spark.sql.catalyst.InternalRow> deleteFilter)
read
public final org.apache.spark.sql.vectorized.ColumnarBatch read(org.apache.spark.sql.vectorized.ColumnarBatch reuse, int numRowsToRead)
Description copied from interface: VectorizedReader
Reads a batch of type T (here, ColumnarBatch) containing numRowsToRead rows
Parameters:
reuse - container for the last batch, to be reused for the next batch
numRowsToRead - number of rows to read
Returns:
batch of records of type T (here, ColumnarBatch)
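The calling pattern implied by the reuse and numRowsToRead parameters can be sketched as a loop that hands the previous batch back to the reader and shrinks the final request to the rows that remain. The types below (ReuseLoopSketch, Reader, Batch) are hypothetical stand-ins, not Iceberg or Spark classes, and how the stand-in reader treats reuse is an assumption made for illustration; it does not describe what ColumnarBatchReader does internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only: not Iceberg or Spark classes.
final class ReuseLoopSketch {

  /** Stand-in for a reusable batch container with a fixed capacity. */
  static final class Batch {
    final long[] values;
    int numRows;

    Batch(int capacity) {
      this.values = new long[capacity];
    }
  }

  /** Stand-in reader over one column whose values are the row positions 0, 1, 2, ... */
  static final class Reader {
    private final int batchSize;
    private long nextRow = 0;

    Reader(int batchSize) {
      this.batchSize = batchSize;
    }

    /** Fills the reused container when one is given; allocates a new one otherwise. */
    Batch read(Batch reuse, int numRowsToRead) {
      if (numRowsToRead <= 0 || numRowsToRead > batchSize) {
        throw new IllegalArgumentException("Invalid number of rows to read: " + numRowsToRead);
      }
      Batch batch = (reuse != null) ? reuse : new Batch(batchSize);
      for (int i = 0; i < numRowsToRead; i++) {
        batch.values[i] = nextRow++;
      }
      batch.numRows = numRowsToRead;
      return batch;
    }
  }

  /** Reads totalRows rows in batches and returns the size of each batch read. */
  static List<Integer> batchSizes(long totalRows, int batchSize) {
    Reader reader = new Reader(batchSize);
    List<Integer> sizes = new ArrayList<>();
    Batch batch = null; // nothing to reuse on the first call
    long remaining = totalRows;
    while (remaining > 0) {
      int numRowsToRead = (int) Math.min(batchSize, remaining);
      batch = reader.read(batch, numRowsToRead);
      sizes.add(batch.numRows);
      remaining -= numRowsToRead;
    }
    return sizes;
  }

  public static void main(String[] args) {
    System.out.println(batchSizes(10, 4)); // prints [4, 4, 2]
  }
}
```

Because the container is handed back on every call in this sketch, a caller that needs the contents of an earlier batch has to copy them out before requesting the next one.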