Class ColumnarBatchReader

java.lang.Object
org.apache.iceberg.arrow.vectorized.BaseBatchReader<org.apache.spark.sql.vectorized.ColumnarBatch>
org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader
All Implemented Interfaces:
VectorizedReader<org.apache.spark.sql.vectorized.ColumnarBatch>

public class ColumnarBatchReader extends BaseBatchReader<org.apache.spark.sql.vectorized.ColumnarBatch>
VectorizedReader that returns Spark's ColumnarBatch to support Spark's vectorized read path. The returned ColumnarBatch is created from the Arrow vectors populated via delegated read calls to VectorReader(s).
  • Constructor Details

  • Method Details

    • setRowGroupInfo

      public void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore pageStore, Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metaData, long rowPosition)
      Description copied from interface: VectorizedReader
      Sets the row group information to be used with this reader.
      Specified by:
      setRowGroupInfo in interface VectorizedReader<org.apache.spark.sql.vectorized.ColumnarBatch>
      Overrides:
      setRowGroupInfo in class BaseBatchReader<org.apache.spark.sql.vectorized.ColumnarBatch>
      Parameters:
      pageStore - row group information for all the columns
      metaData - map of ColumnPath -> ColumnChunkMetaData for the row group
      rowPosition - the row group's row offset in the Parquet file
    • setDeleteFilter

      public void setDeleteFilter(DeleteFilter<org.apache.spark.sql.catalyst.InternalRow> deleteFilter)
    • read

      public final org.apache.spark.sql.vectorized.ColumnarBatch read(org.apache.spark.sql.vectorized.ColumnarBatch reuse, int numRowsToRead)
      Description copied from interface: VectorizedReader
      Reads a batch of type T (here, ColumnarBatch) of size numRowsToRead.
      Parameters:
      reuse - container for the last batch to be reused for next batch
      numRowsToRead - number of rows to read
      Returns:
      batch of records of type T (here, ColumnarBatch)
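
The two calls documented above are typically used together: setRowGroupInfo once per row group, then read repeatedly with the previous batch passed back as reuse. The following is a minimal sketch of that call order only, not the Iceberg implementation. Batch, SketchReader, and readAll are hypothetical stand-ins introduced for illustration, the page store and metadata arguments are reduced to a plain row count, and rowPosition is assumed to be the running total of rows in the preceding row groups.

```java
import java.util.ArrayList;
import java.util.List;

public class ReadLoopSketch {

  /** Hypothetical stand-in for a reusable batch container such as ColumnarBatch. */
  static final class Batch {
    long firstRowPosition;
    int numRows;
  }

  /** Hypothetical stand-in that mirrors the two calls documented above. */
  static final class SketchReader {
    private long nextRowPosition;
    private long rowsLeftInGroup;

    // Analogous to setRowGroupInfo(pageStore, metaData, rowPosition); the
    // page store and metadata are reduced to a row count (an assumption).
    void setRowGroupInfo(long rowCount, long rowPosition) {
      this.rowsLeftInGroup = rowCount;
      this.nextRowPosition = rowPosition;
    }

    // Analogous to read(reuse, numRowsToRead): fills the reused container
    // when one is supplied, otherwise allocates a new one.
    Batch read(Batch reuse, int numRowsToRead) {
      if (numRowsToRead > rowsLeftInGroup) {
        throw new IllegalStateException("read past the end of the row group");
      }
      Batch batch = (reuse != null) ? reuse : new Batch();
      batch.firstRowPosition = nextRowPosition;
      batch.numRows = numRowsToRead;
      nextRowPosition += numRowsToRead;
      rowsLeftInGroup -= numRowsToRead;
      return batch;
    }
  }

  /** Reads every row group in batches; records "firstRowPosition:numRows" per batch. */
  static List<String> readAll(long[] rowGroupRowCounts, int batchSize) {
    SketchReader reader = new SketchReader();
    List<String> seen = new ArrayList<>();
    Batch reuse = null;
    long rowPosition = 0; // offset of the current row group's first row in the file
    for (long rowCount : rowGroupRowCounts) {
      reader.setRowGroupInfo(rowCount, rowPosition);
      long remaining = rowCount;
      while (remaining > 0) {
        int numRowsToRead = (int) Math.min(batchSize, remaining);
        reuse = reader.read(reuse, numRowsToRead); // same container each time
        seen.add(reuse.firstRowPosition + ":" + reuse.numRows);
        remaining -= numRowsToRead;
      }
      rowPosition += rowCount; // next row group starts after this one
    }
    return seen;
  }

  public static void main(String[] args) {
    // Two row groups of 5 and 3 rows, read in batches of at most 4 rows.
    System.out.println(readAll(new long[] {5, 3}, 4)); // prints [0:4, 4:1, 5:3]
  }
}
```

Because the same container is handed back on every call, the sketch copies out the values it needs before the next read overwrites them. A caller that reuses a real batch would presumably need the same care.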