Interface VectorizedReader<T>

All Known Implementing Classes:
BaseBatchReader, ColumnarBatchReader, VectorizedArrowReader, VectorizedArrowReader.ConstantVectorReader, VectorizedArrowReader.DeletedVectorReader

public interface VectorizedReader<T>
Interface for vectorized Iceberg readers.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    close()
    Release any resources allocated.
    T
    read(T reuse, int numRows)
    Reads a batch of type T containing numRows rows
    void
    setBatchSize(int batchSize)
     
    void
    setRowGroupInfo(org.apache.parquet.column.page.PageReadStore pages, Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metadata, long rowPosition)
    Sets the row group information to be used with this reader
  • Method Details

    • read

      T read(T reuse, int numRows)
      Reads a batch of type T containing numRows rows
      Parameters:
      reuse - container from the previous batch, to be reused for the next batch
      numRows - number of rows to read
      Returns:
      batch of records of type T
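
      The reuse parameter can be illustrated with a small sketch. This is not Iceberg code: BatchReader, CountingReader, and sumAll are hypothetical names, and the stand-in interface mirrors only the shape of read(T reuse, int numRows). It assumes an implementation that writes into the caller's container when it is large enough and allocates a new one otherwise; real implementations may behave differently.

```java
public class ReadReuseSketch {

  /** Hypothetical stand-in that mirrors only the shape of VectorizedReader.read. */
  interface BatchReader<T> {
    T read(T reuse, int numRows);
  }

  /** Toy reader that emits consecutive values 0, 1, 2, ... into a long[] batch. */
  static class CountingReader implements BatchReader<long[]> {
    private long next = 0;

    @Override
    public long[] read(long[] reuse, int numRows) {
      // Assumption: reuse the caller's container when it fits, otherwise allocate.
      long[] batch = (reuse != null && reuse.length >= numRows) ? reuse : new long[numRows];
      for (int i = 0; i < numRows; i++) {
        batch[i] = next++;
      }
      return batch;
    }
  }

  /** Reads totalRows rows in batches of at most batchSize and sums the values. */
  static long sumAll(BatchReader<long[]> reader, long totalRows, int batchSize) {
    long sum = 0;
    long remaining = totalRows;
    long[] batch = null; // no container exists before the first call
    while (remaining > 0) {
      int numRows = (int) Math.min(batchSize, remaining);
      // Pass the previous batch back in so the reader can reuse it.
      batch = reader.read(batch, numRows);
      for (int i = 0; i < numRows; i++) {
        sum += batch[i];
      }
      remaining -= numRows;
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println(sumAll(new CountingReader(), 10, 4));
  }
}
```

      Because the container may be overwritten by the next call, a caller in this sketch consumes each batch before requesting another.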
    • setBatchSize

      void setBatchSize(int batchSize)
    • setRowGroupInfo

      void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore pages, Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metadata, long rowPosition)
      Sets the row group information to be used with this reader
      Parameters:
      pages - row group information for all the columns
      metadata - map of ColumnPath -> ColumnChunkMetaData for the row group
      rowPosition - the row group's row offset in the Parquet file
    • close

      void close()
      Release any resources allocated.
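
      The four methods suggest a calling sequence: set the batch size, set the row group information for each row group, read batches, then close. The sketch below illustrates that sequence under stated assumptions. ToyVectorizedReader, ToyReader, and sumRowGroups are hypothetical names, the Parquet types (PageReadStore and the ColumnPath map) are replaced with a plain long[] per row group so the example runs without dependencies, and the call order is an assumption rather than something this page specifies.

```java
public class ReaderLifecycleSketch {

  /** Hypothetical stand-in; Parquet arguments are replaced by a long[] per row group. */
  interface ToyVectorizedReader<T> {
    T read(T reuse, int numRows);
    void setBatchSize(int batchSize);
    void setRowGroupInfo(long[] rowGroupValues, long rowPosition);
    void close();
  }

  /** Toy implementation that copies values out of the current row group. */
  static class ToyReader implements ToyVectorizedReader<long[]> {
    private int batchSize;
    private long[] values;
    private int offset;
    long lastRowPosition = -1;
    boolean closed = false;

    @Override
    public void setBatchSize(int batchSize) {
      this.batchSize = batchSize;
    }

    @Override
    public void setRowGroupInfo(long[] rowGroupValues, long rowPosition) {
      this.values = rowGroupValues;
      this.offset = 0; // start reading at the beginning of the new row group
      this.lastRowPosition = rowPosition;
    }

    @Override
    public long[] read(long[] reuse, int numRows) {
      long[] batch = (reuse != null && reuse.length >= batchSize) ? reuse : new long[batchSize];
      System.arraycopy(values, offset, batch, 0, numRows);
      offset += numRows;
      return batch;
    }

    @Override
    public void close() {
      values = null;
      closed = true;
    }
  }

  /** Drives the reader over every row group and sums all values read. */
  static long sumRowGroups(ToyVectorizedReader<long[]> reader, long[][] rowGroups, int batchSize) {
    long sum = 0;
    long rowPosition = 0; // offset of the current row group's first row in the file
    long[] batch = null;
    try {
      reader.setBatchSize(batchSize);
      for (long[] group : rowGroups) {
        reader.setRowGroupInfo(group, rowPosition);
        int remaining = group.length;
        while (remaining > 0) {
          int numRows = Math.min(batchSize, remaining);
          batch = reader.read(batch, numRows);
          for (int i = 0; i < numRows; i++) {
            sum += batch[i];
          }
          remaining -= numRows;
        }
        rowPosition += group.length;
      }
    } finally {
      // The interface declares close() but is not shown extending AutoCloseable,
      // so this sketch uses try/finally rather than try-with-resources.
      reader.close();
    }
    return sum;
  }

  public static void main(String[] args) {
    long[][] rowGroups = {{1, 2, 3}, {4, 5}};
    System.out.println(sumRowGroups(new ToyReader(), rowGroups, 2));
  }
}
```

      In this sketch rowPosition advances by the size of each row group, matching the description of the parameter as the row group's row offset in the file.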