Class VectorizedPageIterator
java.lang.Object
org.apache.iceberg.parquet.BasePageIterator
org.apache.iceberg.arrow.vectorized.parquet.VectorizedPageIterator
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.iceberg.parquet.BasePageIterator
BasePageIterator.IntIterator
-
Field Summary
Fields inherited from class org.apache.iceberg.parquet.BasePageIterator
currentDL, currentRL, definitionLevels, desc, dictionary, hasNext, page, repetitionLevels, triplesCount, triplesRead, valueEncoding, values, writerVersion
-
Constructor Summary
Constructor | Description
VectorizedPageIterator(org.apache.parquet.column.ColumnDescriptor desc, String writerVersion, boolean setValidityVector)
-
Method Summary
Modifier and Type | Method | Description
protected void | initDataReader(org.apache.parquet.column.Encoding dataEncoding, org.apache.parquet.bytes.ByteBufferInputStream in, int valueCount)
protected void | initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV1 dataPageV1, org.apache.parquet.column.ColumnDescriptor desc, org.apache.parquet.bytes.ByteBufferInputStream in, int triplesCount)
protected void | initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV2 dataPageV2, org.apache.parquet.column.ColumnDescriptor desc)
int | nextBatchDictionaryIds(org.apache.arrow.vector.IntVector vector, int expectedBatchSize, int numValsInVector, NullabilityHolder holder) | Method for reading a batch of dictionary ids from the dictionary-encoded data pages.
boolean | producesDictionaryEncodedVector()
protected void | reset()
void | setAllPagesDictEncoded(boolean allDictEncoded)
Methods inherited from class org.apache.iceberg.parquet.BasePageIterator
currentPageCount, hasNext, initFromPage, initFromPage, initRepetitionLevelsReader, initRepetitionLevelsReader, setDictionary, setPage
-
Constructor Details
-
VectorizedPageIterator
public VectorizedPageIterator(org.apache.parquet.column.ColumnDescriptor desc, String writerVersion, boolean setValidityVector)
-
-
Method Details
-
setAllPagesDictEncoded
public void setAllPagesDictEncoded(boolean allDictEncoded)
-
reset
protected void reset()
Overrides:
reset in class BasePageIterator
-
initDataReader
protected void initDataReader(org.apache.parquet.column.Encoding dataEncoding, org.apache.parquet.bytes.ByteBufferInputStream in, int valueCount)
Specified by:
initDataReader in class BasePageIterator
-
producesDictionaryEncodedVector
public boolean producesDictionaryEncodedVector()
-
initDefinitionLevelsReader
protected void initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV1 dataPageV1, org.apache.parquet.column.ColumnDescriptor desc, org.apache.parquet.bytes.ByteBufferInputStream in, int triplesCount) throws IOException
Specified by:
initDefinitionLevelsReader in class BasePageIterator
Throws:
IOException
-
initDefinitionLevelsReader
protected void initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV2 dataPageV2, org.apache.parquet.column.ColumnDescriptor desc) throws IOException
Specified by:
initDefinitionLevelsReader in class BasePageIterator
Throws:
IOException
-
nextBatchDictionaryIds
public int nextBatchDictionaryIds(org.apache.arrow.vector.IntVector vector, int expectedBatchSize, int numValsInVector, NullabilityHolder holder)
Method for reading a batch of dictionary ids from the dictionary-encoded data pages. Like definition levels, dictionary ids in Parquet are also encoded with the RLE/bit-packed hybrid encoding.
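To illustrate the RLE/bit-packed hybrid encoding mentioned above, here is a minimal sketch of a decoder for a stream of such runs. It is not Iceberg's implementation and does not use Arrow vectors. The class name RleBitPackedSketch and its methods are hypothetical, and the sketch assumes the caller already knows the bit width and has skipped any page-level prefix. Each run starts with an unsigned varint header. An even header means a repeated value, and an odd header means groups of eight bit-packed values.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Iceberg or Parquet.
final class RleBitPackedSketch {

  // Reads an unsigned LEB128 varint, used as the header of each run.
  static int readUnsignedVarInt(ByteBuffer in) {
    int value = 0;
    int shift = 0;
    int b;
    do {
      b = in.get() & 0xFF;
      value |= (b & 0x7F) << shift;
      shift += 7;
    } while ((b & 0x80) != 0);
    return value;
  }

  // Decodes `count` values of `bitWidth` bits each from a hybrid-encoded stream.
  public static int[] decode(byte[] data, int bitWidth, int count) {
    ByteBuffer in = ByteBuffer.wrap(data);
    int[] out = new int[count];
    long mask = (1L << bitWidth) - 1;
    int pos = 0;
    while (pos < count) {
      int header = readUnsignedVarInt(in);
      if ((header & 1) == 0) {
        // RLE run: header >> 1 is the run length, followed by one value
        // stored little-endian in ceil(bitWidth / 8) bytes.
        int runLength = header >>> 1;
        int byteWidth = (bitWidth + 7) / 8;
        int value = 0;
        for (int i = 0; i < byteWidth; i++) {
          value |= (in.get() & 0xFF) << (8 * i);
        }
        int n = Math.min(runLength, count - pos);
        Arrays.fill(out, pos, pos + n, value);
        pos += n;
      } else {
        // Bit-packed run: header >> 1 is the number of 8-value groups,
        // packed from the least significant bit of each byte.
        int numValues = (header >>> 1) * 8;
        long buffer = 0;
        int bitsInBuffer = 0;
        for (int i = 0; i < numValues; i++) {
          while (bitsInBuffer < bitWidth) {
            buffer |= (long) (in.get() & 0xFF) << bitsInBuffer;
            bitsInBuffer += 8;
          }
          int v = (int) (buffer & mask);
          buffer >>>= bitWidth;
          bitsInBuffer -= bitWidth;
          if (pos < count) {
            out[pos++] = v; // padding values past `count` are read but dropped
          }
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // An RLE run of four 5s, then one bit-packed group holding 0..7 (bit width 3).
    byte[] data = {0x08, 0x05, 0x03, (byte) 0x88, (byte) 0xC6, (byte) 0xFA};
    System.out.println(Arrays.toString(decode(data, 3, 12)));
    // prints [5, 5, 5, 5, 0, 1, 2, 3, 4, 5, 6, 7]
  }
}
```

A vectorized reader has the same two cases to handle, but would typically write decoded ids into a vector in bulk and track nulls separately rather than filling an int array.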
-