Class BaseVectorizedParquetValuesReader

java.lang.Object
org.apache.parquet.column.values.ValuesReader
org.apache.iceberg.arrow.vectorized.parquet.BaseVectorizedParquetValuesReader
Direct Known Subclasses:
VectorizedDictionaryEncodedParquetValuesReader, VectorizedParquetDefinitionLevelReader

public class BaseVectorizedParquetValuesReader extends org.apache.parquet.column.values.ValuesReader
A values reader for Parquet's run-length encoded data that reads column data in batches instead of one value at a time. This is based on the VectorizedRleValuesReader class in Apache Spark, with this change:

  • Writes batches of retrieved values to Arrow vectors. If not all pages of a column within the row group are dictionary encoded, dictionary ids are eagerly decoded into actual values before they are written to the Arrow vectors.
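To illustrate what reading run-length encoded data "in batches" can look like, the following is a minimal sketch of a decoder for Parquet's RLE/bit-packing hybrid encoding that fills an int array per call instead of returning one value at a time. It is not the Iceberg or Spark implementation: the class name RleHybridBatchSketch and its readBatch method are hypothetical, it writes to a plain int[] rather than an Arrow vector, and it assumes every bit-packed run is stored in full (no truncated final group).

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical sketch of batch decoding for the RLE/bit-packing hybrid encoding. */
final class RleHybridBatchSketch {
  private final ByteBuffer in;
  private final int bitWidth;
  private boolean rleRun;             // true: current run is RLE; false: bit-packed
  private int remaining;              // values left in the current run
  private int rleValue;               // the repeated value of an RLE run
  private int[] packed = new int[0];  // unpacked values of a bit-packed run
  private int packedPos;              // next unread index in packed

  RleHybridBatchSketch(byte[] data, int bitWidth) {
    this.in = ByteBuffer.wrap(data);
    this.bitWidth = bitWidth;
  }

  /** Decodes up to count values into out starting at offset; returns how many were written. */
  int readBatch(int[] out, int offset, int count) {
    int read = 0;
    while (read < count) {
      if (remaining == 0) {
        if (!in.hasRemaining()) {
          break; // no more runs
        }
        readNextRun();
      }
      int n = Math.min(count - read, remaining);
      if (rleRun) {
        // An RLE run is copied with a single fill rather than n individual reads.
        Arrays.fill(out, offset + read, offset + read + n, rleValue);
      } else {
        System.arraycopy(packed, packedPos, out, offset + read, n);
        packedPos += n;
      }
      remaining -= n;
      read += n;
    }
    return read;
  }

  private void readNextRun() {
    int header = readUnsignedVarInt();
    int count = header >>> 1;
    if ((header & 1) == 0) {
      // RLE run: count repetitions of one value stored in ceil(bitWidth / 8) bytes.
      rleRun = true;
      remaining = count;
      rleValue = 0;
      int bytes = (bitWidth + 7) / 8;
      for (int i = 0; i < bytes; i++) {
        rleValue |= (in.get() & 0xFF) << (8 * i); // little-endian
      }
    } else {
      // Bit-packed run: count groups of 8 values, packed from the least significant bit.
      rleRun = false;
      remaining = count * 8;
      packed = new int[remaining];
      packedPos = 0;
      long buffer = 0;
      int bitsInBuffer = 0;
      long mask = (1L << bitWidth) - 1;
      for (int i = 0; i < remaining; i++) {
        while (bitsInBuffer < bitWidth) {
          buffer |= (long) (in.get() & 0xFF) << bitsInBuffer;
          bitsInBuffer += 8;
        }
        packed[i] = (int) (buffer & mask);
        buffer >>>= bitWidth;
        bitsInBuffer -= bitWidth;
      }
    }
  }

  private int readUnsignedVarInt() {
    int value = 0;
    int shift = 0;
    int b;
    do {
      b = in.get() & 0xFF;
      value |= (b & 0x7F) << shift;
      shift += 7;
    } while ((b & 0x80) != 0);
    return value;
  }

  public static void main(String[] args) {
    // RLE run: header 0x08 (4 values), value 5. Bit-packed run: header 0x03 (1 group), values 0..7.
    byte[] data = {0x08, 0x05, 0x03, (byte) 0x88, (byte) 0xC6, (byte) 0xFA};
    RleHybridBatchSketch reader = new RleHybridBatchSketch(data, 3);
    int[] batch = new int[6];
    reader.readBatch(batch, 0, 6);
    System.out.println(Arrays.toString(batch)); // [5, 5, 5, 5, 0, 1]
    reader.readBatch(batch, 0, 6);
    System.out.println(Arrays.toString(batch)); // [2, 3, 4, 5, 6, 7]
  }
}
```

A batch may span several runs, as the first call in main shows: it drains the RLE run and continues into the bit-packed run without the caller needing to know where the boundary lies.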

  • Constructor Details

    • BaseVectorizedParquetValuesReader

      public BaseVectorizedParquetValuesReader(int maxDefLevel, boolean setValidityVector)
    • BaseVectorizedParquetValuesReader

      public BaseVectorizedParquetValuesReader(int bitWidth, int maxDefLevel, boolean setValidityVector)
    • BaseVectorizedParquetValuesReader

      public BaseVectorizedParquetValuesReader(int bitWidth, int maxDefLevel, boolean readLength, boolean setValidityVector)
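The constructors above are undocumented, so the following is only an illustration of how a maximum definition level typically relates to null tracking. In Parquet, a value is present when its definition level equals the column's maximum definition level; in Arrow, validity is stored as a bitmap with one bit per slot, least significant bit first, where 1 means non-null. That maxDefLevel and setValidityVector are used this way is an inference from the parameter names, not something this page states, and the class and method names below are hypothetical.

```java
/** Hypothetical sketch: deriving an Arrow-style validity bitmap from definition levels. */
final class DefinitionLevelValiditySketch {

  /** Returns a bitmap with bit i set when defLevels[i] equals maxDefLevel (value present). */
  static byte[] validityBitmap(int[] defLevels, int maxDefLevel) {
    byte[] bitmap = new byte[(defLevels.length + 7) / 8];
    for (int i = 0; i < defLevels.length; i++) {
      if (defLevels[i] == maxDefLevel) {
        bitmap[i >> 3] |= (byte) (1 << (i & 7)); // least significant bit first
      }
    }
    return bitmap;
  }

  /** Counts slots whose definition level is below the maximum, i.e. nulls. */
  static int nullCount(int[] defLevels, int maxDefLevel) {
    int nulls = 0;
    for (int level : defLevels) {
      if (level != maxDefLevel) {
        nulls++;
      }
    }
    return nulls;
  }

  public static void main(String[] args) {
    int[] defLevels = {1, 0, 1, 1, 0, 0, 1, 1, 1, 0};
    byte[] bitmap = validityBitmap(defLevels, 1);
    System.out.printf("%02X %02X%n", bitmap[0] & 0xFF, bitmap[1] & 0xFF); // CD 01
    System.out.println(nullCount(defLevels, 1)); // 4
  }
}
```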
  • Method Details

    • initFromPage

      public void initFromPage(int valueCount, org.apache.parquet.bytes.ByteBufferInputStream in) throws IOException
      Overrides:
      initFromPage in class org.apache.parquet.column.values.ValuesReader
      Throws:
      IOException
    • readBoolean

      public boolean readBoolean()
      Overrides:
      readBoolean in class org.apache.parquet.column.values.ValuesReader
    • skip

      public void skip()
      Specified by:
      skip in class org.apache.parquet.column.values.ValuesReader
    • readValueDictionaryId

      public int readValueDictionaryId()
      Overrides:
      readValueDictionaryId in class org.apache.parquet.column.values.ValuesReader
    • readInteger

      public int readInteger()
      Overrides:
      readInteger in class org.apache.parquet.column.values.ValuesReader