Class VectorizedPageIterator

java.lang.Object
org.apache.iceberg.parquet.BasePageIterator
org.apache.iceberg.arrow.vectorized.parquet.VectorizedPageIterator

public class VectorizedPageIterator extends BasePageIterator
  • Constructor Details

    • VectorizedPageIterator

      public VectorizedPageIterator(org.apache.parquet.column.ColumnDescriptor desc, String writerVersion, boolean setValidityVector)
  • Method Details

    • setAllPagesDictEncoded

      public void setAllPagesDictEncoded(boolean allDictEncoded)
    • reset

      protected void reset()
      Overrides:
      reset in class BasePageIterator
    • initDataReader

      protected void initDataReader(org.apache.parquet.column.Encoding dataEncoding, org.apache.parquet.bytes.ByteBufferInputStream in, int valueCount)
      Specified by:
      initDataReader in class BasePageIterator
    • producesDictionaryEncodedVector

      public boolean producesDictionaryEncodedVector()
    • initDefinitionLevelsReader

      protected void initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV1 dataPageV1, org.apache.parquet.column.ColumnDescriptor desc, org.apache.parquet.bytes.ByteBufferInputStream in, int triplesCount) throws IOException
      Specified by:
      initDefinitionLevelsReader in class BasePageIterator
      Throws:
      IOException
    • initDefinitionLevelsReader

      protected void initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV2 dataPageV2, org.apache.parquet.column.ColumnDescriptor desc) throws IOException
      Specified by:
      initDefinitionLevelsReader in class BasePageIterator
      Throws:
      IOException
    • nextBatchDictionaryIds

      public int nextBatchDictionaryIds(org.apache.arrow.vector.IntVector vector, int expectedBatchSize, int numValsInVector, NullabilityHolder holder)
      Reads a batch of dictionary ids from dictionary-encoded data pages. Like definition levels, dictionary ids in Parquet are encoded with the RLE/bit-packed hybrid encoding.
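
      The run-length/bit-packed hybrid layout mentioned above can be illustrated with a small standalone decoder. This is a minimal sketch based on the encoding as described in the Parquet format documentation, not the reader Iceberg uses internally. The class name HybridRleSketch, the method decode, and its parameters are hypothetical and exist only for this illustration. The sketch assumes the caller already knows the bit width and the number of values to read.

```java
// Hypothetical illustration only; not part of Iceberg or parquet-java.
final class HybridRleSketch {
  private HybridRleSketch() {}

  /**
   * Decodes valueCount integers from a sequence of RLE and bit-packed runs.
   * Each run starts with a ULEB128 header whose lowest bit selects the run type.
   */
  static int[] decode(byte[] data, int bitWidth, int valueCount) {
    int[] out = new int[valueCount];
    int pos = 0;
    int produced = 0;
    int bytesPerRleValue = (bitWidth + 7) / 8; // RLE value is padded to whole bytes

    while (produced < valueCount) {
      // Read the ULEB128 run header: 7 payload bits per byte, high bit = "more".
      int header = 0;
      int shift = 0;
      int b;
      do {
        b = data[pos++] & 0xFF;
        header |= (b & 0x7F) << shift;
        shift += 7;
      } while ((b & 0x80) != 0);

      if ((header & 1) == 0) {
        // RLE run: (header >>> 1) repetitions of one little-endian value.
        int runLength = header >>> 1;
        int value = 0;
        for (int i = 0; i < bytesPerRleValue; i++) {
          value |= (data[pos++] & 0xFF) << (8 * i);
        }
        for (int i = 0; i < runLength && produced < valueCount; i++) {
          out[produced++] = value;
        }
      } else {
        // Bit-packed run: (header >>> 1) groups of 8 values, packed LSB first.
        int groups = header >>> 1;
        int numValues = groups * 8;
        for (int i = 0; i < numValues; i++) {
          int value = 0;
          for (int bit = 0; bit < bitWidth; bit++) {
            int absoluteBit = i * bitWidth + bit;
            int byteVal = data[pos + (absoluteBit >>> 3)] & 0xFF;
            value |= ((byteVal >>> (absoluteBit & 7)) & 1) << bit;
          }
          // Trailing padding values in the last group are skipped.
          if (produced < valueCount) {
            out[produced++] = value;
          }
        }
        pos += groups * bitWidth; // each group of 8 values occupies bitWidth bytes
      }
    }
    return out;
  }
}
```

      For example, with a bit width of 3, the bytes 0x0A 0x06 0x03 0x88 0xC6 0xFA decode to five copies of 6 (an RLE run) followed by 0 through 7 (one bit-packed group). A production reader such as the one behind nextBatchDictionaryIds would write into an Arrow vector in batches rather than allocate an int array, so the sketch only shows the byte layout.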