Class VectorizedPageIterator
- java.lang.Object
-
- org.apache.iceberg.parquet.BasePageIterator
-
- org.apache.iceberg.arrow.vectorized.parquet.VectorizedPageIterator
-
public class VectorizedPageIterator extends BasePageIterator
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.iceberg.parquet.BasePageIterator
BasePageIterator.IntIterator
-
-
Field Summary
-
Fields inherited from class org.apache.iceberg.parquet.BasePageIterator
currentDL, currentRL, definitionLevels, desc, dictionary, hasNext, page, repetitionLevels, triplesCount, triplesRead, valueEncoding, values, writerVersion
-
-
Constructor Summary
VectorizedPageIterator(org.apache.parquet.column.ColumnDescriptor desc, java.lang.String writerVersion, boolean setValidityVector)
-
Method Summary
Modifier and Type, Method, Description
protected void initDataReader(org.apache.parquet.column.Encoding dataEncoding, org.apache.parquet.bytes.ByteBufferInputStream in, int valueCount)
protected void initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV1 dataPageV1, org.apache.parquet.column.ColumnDescriptor desc, org.apache.parquet.bytes.ByteBufferInputStream in, int triplesCount)
protected void initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV2 dataPageV2, org.apache.parquet.column.ColumnDescriptor desc)
int nextBatchDictionaryIds(org.apache.arrow.vector.IntVector vector, int expectedBatchSize, int numValsInVector, NullabilityHolder holder)
Method for reading a batch of dictionary ids from the dictionary encoded data pages.
boolean producesDictionaryEncodedVector()
protected void reset()
void setAllPagesDictEncoded(boolean allDictEncoded)
-
Methods inherited from class org.apache.iceberg.parquet.BasePageIterator
currentPageCount, hasNext, initFromPage, initFromPage, initRepetitionLevelsReader, initRepetitionLevelsReader, setDictionary, setPage
-
Method Detail
-
setAllPagesDictEncoded
public void setAllPagesDictEncoded(boolean allDictEncoded)
-
reset
protected void reset()
- Overrides:
reset
in class BasePageIterator
-
initDataReader
protected void initDataReader(org.apache.parquet.column.Encoding dataEncoding, org.apache.parquet.bytes.ByteBufferInputStream in, int valueCount)
- Specified by:
initDataReader
in class BasePageIterator
-
producesDictionaryEncodedVector
public boolean producesDictionaryEncodedVector()
-
initDefinitionLevelsReader
protected void initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV1 dataPageV1, org.apache.parquet.column.ColumnDescriptor desc, org.apache.parquet.bytes.ByteBufferInputStream in, int triplesCount) throws java.io.IOException
- Specified by:
initDefinitionLevelsReader
in class BasePageIterator
- Throws:
java.io.IOException
-
initDefinitionLevelsReader
protected void initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV2 dataPageV2, org.apache.parquet.column.ColumnDescriptor desc) throws java.io.IOException
- Specified by:
initDefinitionLevelsReader
in class BasePageIterator
- Throws:
java.io.IOException
-
nextBatchDictionaryIds
public int nextBatchDictionaryIds(org.apache.arrow.vector.IntVector vector, int expectedBatchSize, int numValsInVector, NullabilityHolder holder)
Reads a batch of dictionary ids from dictionary-encoded data pages. Like definition levels, dictionary ids in Parquet are stored using the RLE/bit-packing hybrid encoding.
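To illustrate the encoding mentioned above, here is a minimal sketch of a decoder for Parquet's RLE/bit-packing hybrid runs. It is not the reader this class uses internally. The class name RleBitPackedSketch and its methods are hypothetical, the bit width is assumed to be passed in by the caller, and the input is assumed to contain only the runs themselves (no length prefix or bit-width byte).

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; this is not Iceberg's reader. */
class RleBitPackedSketch {
  private final byte[] data;
  private int pos = 0;

  private RleBitPackedSketch(byte[] data) {
    this.data = data;
  }

  /** Decodes {@code count} values of {@code bitWidth} bits each from {@code data}. */
  static int[] decode(byte[] data, int bitWidth, int count) {
    RleBitPackedSketch reader = new RleBitPackedSketch(data);
    int[] out = new int[count];
    int filled = 0;
    while (filled < count) {
      int header = reader.readUnsignedVarInt();
      if ((header & 1) == 0) {
        // RLE run: (header >> 1) copies of one value stored in ceil(bitWidth / 8) bytes.
        int runLength = header >>> 1;
        int value = reader.readLittleEndian((bitWidth + 7) / 8);
        int n = Math.min(runLength, count - filled);
        Arrays.fill(out, filled, filled + n, value);
        filled += n;
      } else {
        // Bit-packed run: (header >> 1) groups of 8 values, packed least significant bit first.
        int groups = header >>> 1;
        int start = reader.pos;
        for (int i = 0; i < groups * 8 && filled < count; i++) {
          out[filled++] = reader.unpack(start * 8 + i * bitWidth, bitWidth);
        }
        // Each group of 8 values occupies exactly bitWidth bytes.
        reader.pos = start + groups * bitWidth;
      }
    }
    return out;
  }

  /** Reads a ULEB128 varint (7 payload bits per byte, high bit marks continuation). */
  private int readUnsignedVarInt() {
    int value = 0;
    int shift = 0;
    while (true) {
      int b = data[pos++] & 0xFF;
      value |= (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        return value;
      }
      shift += 7;
    }
  }

  /** Reads a little-endian unsigned integer of {@code numBytes} bytes. */
  private int readLittleEndian(int numBytes) {
    int value = 0;
    for (int i = 0; i < numBytes; i++) {
      value |= (data[pos++] & 0xFF) << (8 * i);
    }
    return value;
  }

  /** Extracts {@code bitWidth} bits starting at absolute bit offset {@code bitOffset}. */
  private int unpack(int bitOffset, int bitWidth) {
    int value = 0;
    for (int b = 0; b < bitWidth; b++) {
      int bit = bitOffset + b;
      value |= ((data[bit >>> 3] >>> (bit & 7)) & 1) << b;
    }
    return value;
  }

  public static void main(String[] args) {
    // RLE run: header 0x0A = (5 << 1), value 4. Bit-packed run: header 0x03 = (1 << 1) | 1,
    // followed by the values 0..7 packed at 3 bits each.
    byte[] data = {0x0A, 0x04, 0x03, (byte) 0x88, (byte) 0xC6, (byte) 0xFA};
    System.out.println(Arrays.toString(decode(data, 3, 13)));
    // prints [4, 4, 4, 4, 4, 0, 1, 2, 3, 4, 5, 6, 7]
  }
}
```

A vectorized reader such as this class would typically write the decoded ids straight into the Arrow IntVector rather than into an int array; the array is used here only to keep the sketch self-contained.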