Class BaseVectorizedParquetValuesReader

  • Direct Known Subclasses:
    VectorizedDictionaryEncodedParquetValuesReader, VectorizedParquetDefinitionLevelReader

    public class BaseVectorizedParquetValuesReader
    extends org.apache.parquet.column.values.ValuesReader
    A values reader for Parquet's run-length encoded data that reads column data in batches instead of one value at a time. It is based on the VectorizedRleValuesReader class in Apache Spark, with this change:

      • Writes batches of retrieved values to Arrow vectors. If not all pages of a column within the row group are dictionary encoded, dictionary ids are eagerly decoded into actual values before they are written to the Arrow vectors.
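
    To illustrate what reading run-length encoded data in batches can look like, the following is a minimal sketch of decoding Parquet's RLE/bit-packing hybrid encoding into a plain int array. It is not the code of this class: RleHybridSketch and readBatch are hypothetical names, the output is an int[] rather than an Arrow vector, the buffer is assumed to start at the first run header with no length prefix, and any part of a run left over when the batch fills is dropped, whereas a real reader would presumably keep that state for the next batch.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration only; not part of Iceberg, Parquet, or Spark.
final class RleHybridSketch {

  // Reads an unsigned LEB128 varint (the run header format used by Parquet).
  static int readUnsignedVarInt(ByteBuffer in) {
    int value = 0;
    int shift = 0;
    int b;
    do {
      b = in.get() & 0xFF;
      value |= (b & 0x7F) << shift;
      shift += 7;
    } while ((b & 0x80) != 0);
    return value;
  }

  // Decodes up to `count` values of `bitWidth` bits each into one array.
  static int[] readBatch(ByteBuffer in, int bitWidth, int count) {
    int[] out = new int[count];
    int filled = 0;
    int bytesPerValue = (bitWidth + 7) / 8;
    long mask = (1L << bitWidth) - 1;

    while (filled < count) {
      int header = readUnsignedVarInt(in);
      if ((header & 1) == 0) {
        // RLE run: header >>> 1 is the repeat count, followed by one
        // little-endian value stored in ceil(bitWidth / 8) bytes.
        int runLength = header >>> 1;
        int value = 0;
        for (int i = 0; i < bytesPerValue; i++) {
          value |= (in.get() & 0xFF) << (8 * i);
        }
        int n = Math.min(runLength, count - filled);
        Arrays.fill(out, filled, filled + n, value); // write the whole run at once
        filled += n;
      } else {
        // Bit-packed run: header >>> 1 is the number of 8-value groups,
        // packed least-significant bit first.
        int numValues = (header >>> 1) * 8;
        long bitBuffer = 0;
        int bitsAvailable = 0;
        for (int i = 0; i < numValues; i++) {
          while (bitsAvailable < bitWidth) {
            bitBuffer |= (long) (in.get() & 0xFF) << bitsAvailable;
            bitsAvailable += 8;
          }
          int value = (int) (bitBuffer & mask);
          bitBuffer >>>= bitWidth;
          bitsAvailable -= bitWidth;
          if (filled < count) {
            out[filled++] = value; // values past the batch size are discarded here
          }
        }
      }
    }
    return out;
  }
}
```

    With a bit width of 3, the bytes 0x08 0x05 (an RLE run of four 5s) followed by 0x03 0x88 0xC6 0xFA (one bit-packed group holding 0 through 7) decode to 5, 5, 5, 5, 0, 1, 2, 3, 4, 5, 6, 7 when twelve values are requested.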

    • Method Summary

      Modifier and Type   Method
      void                initFromPage(int valueCount, org.apache.parquet.bytes.ByteBufferInputStream in)
      boolean             readBoolean()
      int                 readInteger()
      int                 readValueDictionaryId()
      void                skip()
      • Methods inherited from class org.apache.parquet.column.values.ValuesReader

        getNextOffset, initFromPage, initFromPage, readBytes, readDouble, readFloat, readLong, skip, updateNextOffset
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • BaseVectorizedParquetValuesReader

        public BaseVectorizedParquetValuesReader(int maxDefLevel,
                                                 boolean setValidityVector)
      • BaseVectorizedParquetValuesReader

        public BaseVectorizedParquetValuesReader(int bitWidth,
                                                 int maxDefLevel,
                                                 boolean setValidityVector)
      • BaseVectorizedParquetValuesReader

        public BaseVectorizedParquetValuesReader(int bitWidth,
                                                 int maxDefLevel,
                                                 boolean readLength,
                                                 boolean setValidityVector)
    • Method Detail

      • initFromPage

        public void initFromPage(int valueCount,
                                 org.apache.parquet.bytes.ByteBufferInputStream in)
        Overrides:
        initFromPage in class org.apache.parquet.column.values.ValuesReader
      • readBoolean

        public boolean readBoolean()
        Overrides:
        readBoolean in class org.apache.parquet.column.values.ValuesReader
      • skip

        public void skip()
        Specified by:
        skip in class org.apache.parquet.column.values.ValuesReader
      • readValueDictionaryId

        public int readValueDictionaryId()
        Overrides:
        readValueDictionaryId in class org.apache.parquet.column.values.ValuesReader
      • readInteger

        public int readInteger()
        Overrides:
        readInteger in class org.apache.parquet.column.values.ValuesReader