Package org.apache.iceberg
Class PartitionStatsHandler
java.lang.Object
org.apache.iceberg.PartitionStatsHandler
Computes, writes and reads the PartitionStatisticsFile. Uses generic readers and writers to support writing and reading of the stats in the table's default format.

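Field Summary
Fields (Modifier and Type, Field):
- static final Types.NestedField DATA_FILE_COUNT
- static final Types.NestedField DATA_RECORD_COUNT
- static final Types.NestedField DV_COUNT
- static final Types.NestedField EQUALITY_DELETE_FILE_COUNT
- static final Types.NestedField EQUALITY_DELETE_RECORD_COUNT
- static final Types.NestedField LAST_UPDATED_AT
- static final Types.NestedField LAST_UPDATED_SNAPSHOT_ID
- static final int PARTITION_FIELD_ID
- static final String PARTITION_FIELD_NAME
- static final Types.NestedField POSITION_DELETE_FILE_COUNT
- static final Types.NestedField POSITION_DELETE_RECORD_COUNT
- static final Types.NestedField SPEC_ID
- static final Types.NestedField TOTAL_DATA_FILE_SIZE_IN_BYTES
- static final Types.NestedField TOTAL_RECORD_COUNT
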
Method Summary
Methods (Modifier and Type, Method, Description):
- static PartitionStatisticsFile computeAndWriteStatsFile(Table table): Computes the stats incrementally, starting after the snapshot that has a partition stats file up to the current snapshot, and writes the merged result into a PartitionStatisticsFile for the table's current snapshot.
- static PartitionStatisticsFile computeAndWriteStatsFile(Table table, long snapshotId): Computes the stats incrementally, starting after the snapshot that has a partition stats file up to the given snapshot, and writes the merged result into a PartitionStatisticsFile for the given snapshot.
- static CloseableIterable<PartitionStats> readPartitionStatsFile(Schema schema, InputFile inputFile): Reads partition statistics from the specified InputFile using the given schema.
- static Schema schema(Types.StructType unifiedPartitionType): Deprecated. Since 1.10.0, will be removed in 1.11.0.
- static Schema schema(Types.StructType unifiedPartitionType, int formatVersion): Generates the partition stats file schema for a given format version based on a combined partition type which considers all specs in a table.

Field Details
- PARTITION_FIELD_ID: public static final int PARTITION_FIELD_ID
- PARTITION_FIELD_NAME
- SPEC_ID
- DATA_RECORD_COUNT
- DATA_FILE_COUNT
- TOTAL_DATA_FILE_SIZE_IN_BYTES
- POSITION_DELETE_RECORD_COUNT
- POSITION_DELETE_FILE_COUNT
- EQUALITY_DELETE_RECORD_COUNT
- EQUALITY_DELETE_FILE_COUNT
- TOTAL_RECORD_COUNT
- LAST_UPDATED_AT
- LAST_UPDATED_SNAPSHOT_ID
- DV_COUNT
 
Method Details
schema
Deprecated. Since 1.10.0, will be removed in 1.11.0. Use schema(StructType, int) instead.
Generates the partition stats file schema based on a combined partition type which considers all specs in a table. Use this only for format versions 1 and 2. For version 3 and above, use schema(StructType, int).
Parameters:
- unifiedPartitionType - unified partition schema type. Can be calculated by Partitioning.partitionType(Table).
Returns:
- a schema that corresponds to the provided unified partition type.
 
schema
Generates the partition stats file schema for a given format version based on a combined partition type which considers all specs in a table.
Parameters:
- unifiedPartitionType - unified partition schema type. Can be calculated by Partitioning.partitionType(Table).
Returns:
- a schema that corresponds to the provided unified partition type.
 
computeAndWriteStatsFile
Computes the stats incrementally, starting after the snapshot that has a partition stats file up to the current snapshot, and writes the merged result into a PartitionStatisticsFile for the table's current snapshot. Does a full compute if a previous statistics file does not exist.
Parameters:
- table - The Table for which the partition statistics are computed.
Returns:
- PartitionStatisticsFile for the current snapshot, or null if no statistics are present.
Throws:
- IOException
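
The incremental behavior described above can be pictured with a small model that does not depend on Iceberg. The sketch below is illustrative only and is not the library's implementation: the class IncrementalPartitionStatsSketch, the Stats holder, and the merge method are hypothetical names, the string partition keys are an assumption, and only three of the counters from Field Details are modeled. It handles additions only and ignores delete files and removed data files. Passing null as the previous stats models the full-compute case.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified, Iceberg-free model of incremental partition stats merging. */
final class IncrementalPartitionStatsSketch {

  /** Hypothetical holder for three of the counters listed under Field Details. */
  static final class Stats {
    final long dataRecordCount;
    final int dataFileCount;
    final long totalDataFileSizeInBytes;

    Stats(long dataRecordCount, int dataFileCount, long totalDataFileSizeInBytes) {
      this.dataRecordCount = dataRecordCount;
      this.dataFileCount = dataFileCount;
      this.totalDataFileSizeInBytes = totalDataFileSizeInBytes;
    }

    /** Returns a new Stats whose counters are the sums of both operands. */
    Stats plus(Stats other) {
      return new Stats(
          dataRecordCount + other.dataRecordCount,
          dataFileCount + other.dataFileCount,
          totalDataFileSizeInBytes + other.totalDataFileSizeInBytes);
    }

    @Override
    public String toString() {
      return "Stats{records=" + dataRecordCount
          + ", files=" + dataFileCount
          + ", bytes=" + totalDataFileSizeInBytes + "}";
    }
  }

  /**
   * Merges per-snapshot additions into previously computed stats.
   *
   * @param previous stats from an earlier stats file, or null to model a full compute
   * @param deltas per-partition additions of each later snapshot, oldest first
   */
  static Map<String, Stats> merge(Map<String, Stats> previous, List<Map<String, Stats>> deltas) {
    // Copy so the caller's map is never mutated; start empty when there is no previous file.
    Map<String, Stats> merged = previous == null ? new HashMap<>() : new HashMap<>(previous);
    for (Map<String, Stats> delta : deltas) {
      // Add each snapshot's counters to the running total of the same partition.
      delta.forEach((partition, stats) -> merged.merge(partition, stats, Stats::plus));
    }
    return merged;
  }

  public static void main(String[] args) {
    Map<String, Stats> previous = Map.of("day=1", new Stats(100, 2, 2048));
    List<Map<String, Stats>> deltas = List.of(
        Map.of("day=1", new Stats(50, 1, 1024), "day=2", new Stats(10, 1, 512)),
        Map.of("day=2", new Stats(5, 1, 256)));

    System.out.println(merge(previous, deltas)); // incremental: previous stats plus later snapshots
    System.out.println(merge(null, deltas));     // full compute: no previous stats file
  }
}
```

In this model the cost of an incremental run depends on the number of snapshots processed since the previous stats file rather than on the full table history, which is the motivation for reusing an existing statistics file when one is available.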
 
computeAndWriteStatsFile
public static PartitionStatisticsFile computeAndWriteStatsFile(Table table, long snapshotId) throws IOException
Computes the stats incrementally, starting after the snapshot that has a partition stats file up to the given snapshot, and writes the merged result into a PartitionStatisticsFile for the given snapshot. Does a full compute if a previous statistics file does not exist.
Parameters:
- table - The Table for which the partition statistics are computed.
- snapshotId - snapshot for which partition statistics are computed.
Returns:
- PartitionStatisticsFile for the given snapshot, or null if no statistics are present.
Throws:
- IOException
 
readPartitionStatsFile
public static CloseableIterable<PartitionStats> readPartitionStatsFile(Schema schema, InputFile inputFile)
Reads partition statistics from the specified InputFile using the given schema.