Package org.apache.iceberg.parquet
Class ParquetUtil
java.lang.Object
org.apache.iceberg.parquet.ParquetUtil
Method Summary
Modifier and Type | Method | Description
static long | extractTimestampInt96(ByteBuffer buffer) | Method to read timestamp (parquet Int96) from bytebuffer.
static Metrics | fileMetrics(InputFile file, MetricsConfig metricsConfig) |
static Metrics | fileMetrics(InputFile file, MetricsConfig metricsConfig, NameMapping nameMapping) |
static Metrics | footerMetrics(org.apache.parquet.hadoop.metadata.ParquetMetadata metadata, Stream<FieldMetrics<?>> fieldMetrics, MetricsConfig metricsConfig) |
static Metrics | footerMetrics(org.apache.parquet.hadoop.metadata.ParquetMetadata metadata, Stream<FieldMetrics<?>> fieldMetrics, MetricsConfig metricsConfig, NameMapping nameMapping) |
 | getSplitOffsets(org.apache.parquet.hadoop.metadata.ParquetMetadata md) | Returns a list of offsets in ascending order determined by the starting position of the row groups.
static boolean | hasNoBloomFilterPages(org.apache.parquet.hadoop.metadata.ColumnChunkMetaData meta) |
static boolean | hasNonDictionaryPages(org.apache.parquet.hadoop.metadata.ColumnChunkMetaData meta) |
static boolean | isIntType(org.apache.parquet.schema.PrimitiveType primitiveType) |
static org.apache.parquet.column.Dictionary | readDictionary(org.apache.parquet.column.ColumnDescriptor desc, org.apache.parquet.column.page.PageReader pageSource) |
Method Details
fileMetrics
public static Metrics fileMetrics(InputFile file, MetricsConfig metricsConfig)
fileMetrics
public static Metrics fileMetrics(InputFile file, MetricsConfig metricsConfig, NameMapping nameMapping)
getSplitOffsets
Returns a list of offsets in ascending order determined by the starting position of the row groups.
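The ordering described above can be illustrated with a minimal sketch. It does not call the Parquet or Iceberg APIs: the class `SplitOffsetsSketch`, the method `sortedOffsets`, and the plain `long[]` of row group starting positions are hypothetical stand-ins chosen for illustration, and the real method takes a `ParquetMetadata` argument instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of ParquetUtil.
final class SplitOffsetsSketch {
  private SplitOffsetsSketch() {}

  // Takes the starting byte position of each row group (assumed input shape)
  // and returns those positions as a list sorted in ascending order.
  static List<Long> sortedOffsets(long[] rowGroupStarts) {
    List<Long> offsets = new ArrayList<>(rowGroupStarts.length);
    for (long start : rowGroupStarts) {
      offsets.add(start);
    }
    Collections.sort(offsets);
    return offsets;
  }

  public static void main(String[] args) {
    // Row groups listed out of order on purpose.
    System.out.println(sortedOffsets(new long[] {4096L, 4L, 1024L})); // prints [4, 1024, 4096]
  }
}
```

Sorted offsets of this kind are typically used to pick split boundaries that line up with row group starts.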
hasNonDictionaryPages
public static boolean hasNonDictionaryPages(org.apache.parquet.hadoop.metadata.ColumnChunkMetaData meta)
hasNoBloomFilterPages
public static boolean hasNoBloomFilterPages(org.apache.parquet.hadoop.metadata.ColumnChunkMetaData meta)
readDictionary
public static org.apache.parquet.column.Dictionary readDictionary(org.apache.parquet.column.ColumnDescriptor desc, org.apache.parquet.column.page.PageReader pageSource)
isIntType
public static boolean isIntType(org.apache.parquet.schema.PrimitiveType primitiveType)
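extractTimestampInt96
public static long extractTimestampInt96(ByteBuffer buffer)
Reads a timestamp stored as a Parquet INT96 value from a ByteBuffer. The method reads 12 bytes from the buffer: 8 bytes holding the time of day in nanoseconds, followed by 4 bytes holding the Julian day.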
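The 12-byte layout described above can be decoded along the following lines. This is a sketch under stated assumptions, not the library's implementation: the class `Int96TimestampSketch` and the method `toEpochMicros` are hypothetical names, and both the little-endian byte order and the choice of microseconds since the Unix epoch as the result unit are assumptions that the description above does not state.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not part of ParquetUtil.
final class Int96TimestampSketch {
  // Julian day number of 1970-01-01, the Unix epoch.
  private static final long UNIX_EPOCH_JULIAN_DAY = 2_440_588L;

  private Int96TimestampSketch() {}

  // Decodes 12 bytes: 8 bytes of time-of-day nanos, then 4 bytes of Julian day.
  // Assumption: little-endian input, result in microseconds since the Unix epoch.
  static long toEpochMicros(ByteBuffer buffer) {
    // Work on a duplicate so the caller's position and byte order are untouched.
    ByteBuffer le = buffer.duplicate().order(ByteOrder.LITTLE_ENDIAN);
    long timeOfDayNanos = le.getLong();
    int julianDay = le.getInt();
    return TimeUnit.DAYS.toMicros(julianDay - UNIX_EPOCH_JULIAN_DAY)
        + TimeUnit.NANOSECONDS.toMicros(timeOfDayNanos);
  }

  public static void main(String[] args) {
    // Encode 1,000 ns past midnight on the epoch day, then decode it.
    ByteBuffer buffer = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
    buffer.putLong(1_000L);
    buffer.putInt(2_440_588);
    buffer.flip();
    System.out.println(toEpochMicros(buffer)); // prints 1
  }
}
```

Converting to microseconds drops sub-microsecond precision from the nanosecond field, so a caller that needs full nanosecond precision would have to keep the two fields separate.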