Class VectorizedArrowReader

java.lang.Object
org.apache.iceberg.arrow.vectorized.VectorizedArrowReader
All Implemented Interfaces:
VectorizedReader<VectorHolder>
Direct Known Subclasses:
VectorizedArrowReader.ConstantVectorReader, VectorizedArrowReader.DeletedVectorReader

public class VectorizedArrowReader extends Object implements VectorizedReader<VectorHolder>
A VectorizedReader that reads a batch of values into Arrow vectors. It also allocates the appropriate kind of Arrow vector for the corresponding Iceberg/Parquet data type.
  • Field Details

  • Constructor Details

    • VectorizedArrowReader

      public VectorizedArrowReader(org.apache.parquet.column.ColumnDescriptor desc, Types.NestedField icebergField, org.apache.arrow.memory.BufferAllocator ra, boolean setArrowValidityVector)
  • Method Details

    • icebergField

      protected Types.NestedField icebergField()
    • setBatchSize

      public void setBatchSize(int batchSize)
      Specified by:
      setBatchSize in interface VectorizedReader<VectorHolder>
    • read

      public VectorHolder read(VectorHolder reuse, int numValsToRead)
      Description copied from interface: VectorizedReader
      Reads a batch of type T containing numValsToRead rows.
      Specified by:
      read in interface VectorizedReader<VectorHolder>
      Parameters:
      reuse - container for the last batch to be reused for next batch
      numValsToRead - number of rows to read
      Returns:
      batch of records of type T
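      The reuse parameter suggests a calling pattern in which the holder returned by one call is passed back into the next, so the reader can avoid allocating a fresh container for every batch. The sketch below illustrates that pattern only. BatchReader, IntBatch, CountingReader, and readAll are hypothetical stand-ins invented for this example; they are not Iceberg or Arrow types, and the real reader's reuse and allocation behavior may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchReadSketch {

  /** Hypothetical stand-in shaped like VectorizedReader<T>; not the Iceberg interface. */
  interface BatchReader<T> {
    void setBatchSize(int batchSize);

    T read(T reuse, int numValsToRead);

    void close();
  }

  /** Hypothetical holder playing the role of VectorHolder: a reusable int buffer. */
  static final class IntBatch {
    final int[] values;
    int size;

    IntBatch(int capacity) {
      this.values = new int[capacity];
    }
  }

  /** Toy reader that emits consecutive integers and reuses the caller's holder when it fits. */
  static final class CountingReader implements BatchReader<IntBatch> {
    private int batchSize = 4;
    private int next = 0;
    private boolean closed = false;

    @Override
    public void setBatchSize(int batchSize) {
      this.batchSize = batchSize;
    }

    @Override
    public IntBatch read(IntBatch reuse, int numValsToRead) {
      if (closed) {
        throw new IllegalStateException("reader is closed");
      }
      if (numValsToRead > batchSize) {
        throw new IllegalArgumentException("numValsToRead exceeds batch size");
      }
      // Reuse the previous holder if it is large enough; otherwise allocate a new one.
      IntBatch out =
          (reuse != null && reuse.values.length >= numValsToRead) ? reuse : new IntBatch(batchSize);
      for (int i = 0; i < numValsToRead; i++) {
        out.values[i] = next++;
      }
      out.size = numValsToRead;
      return out;
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Reads totalRows values in batches of at most batchSize and returns them in order. */
  static List<Integer> readAll(int totalRows, int batchSize) {
    CountingReader reader = new CountingReader();
    reader.setBatchSize(batchSize);
    List<Integer> result = new ArrayList<>();
    IntBatch holder = null; // first call has nothing to reuse
    try {
      int remaining = totalRows;
      while (remaining > 0) {
        int n = Math.min(batchSize, remaining);
        holder = reader.read(holder, n); // pass the previous batch back in
        for (int i = 0; i < holder.size; i++) {
          result.add(holder.values[i]); // copy out before the holder is overwritten
        }
        remaining -= n;
      }
    } finally {
      reader.close(); // release resources even if a read fails
    }
    return result;
  }
}
```

      Because the holder may be overwritten by the next call in this sketch, values are copied out before reading again.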
    • setRowGroupInfo

      public void setRowGroupInfo(org.apache.parquet.column.page.PageReadStore source, Map<org.apache.parquet.hadoop.metadata.ColumnPath,org.apache.parquet.hadoop.metadata.ColumnChunkMetaData> metadata, long rowPosition)
      Description copied from interface: VectorizedReader
      Sets the row group information to be used with this reader
      Specified by:
      setRowGroupInfo in interface VectorizedReader<VectorHolder>
      Parameters:
      source - row group information for all the columns
      metadata - map of ColumnPath -> ColumnChunkMetaData for the row group
      rowPosition - the row group's row offset in the parquet file
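      If rowPosition is read as the number of rows that precede the row group in the file, the offsets for consecutive row groups are the running totals of the earlier groups' row counts. The sketch below shows that arithmetic under this assumption. RowGroupOffsets and startPositions are hypothetical names introduced for illustration and are not part of Iceberg or Parquet.

```java
final class RowGroupOffsets {

  /**
   * Returns the starting row position of each row group, given the row count of
   * every group in file order. Assumes positions are zero-based.
   */
  static long[] startPositions(long[] rowCounts) {
    long[] starts = new long[rowCounts.length];
    long running = 0L;
    for (int i = 0; i < rowCounts.length; i++) {
      starts[i] = running; // rows before group i
      running += rowCounts[i];
    }
    return starts;
  }
}
```

      For example, row groups holding 100, 250, and 50 rows would start at positions 0, 100, and 350.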
    • close

      public void close()
      Description copied from interface: VectorizedReader
      Release any resources allocated.
      Specified by:
      close in interface VectorizedReader<VectorHolder>
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • nulls

      public static VectorizedArrowReader nulls()
    • positions

      public static VectorizedArrowReader positions()
    • positionsWithSetArrowValidityVector

      public static VectorizedArrowReader positionsWithSetArrowValidityVector()