Package org.apache.iceberg
Class PartitionStatsHandler
java.lang.Object
org.apache.iceberg.PartitionStatsHandler
Computes, writes and reads the PartitionStatisticsFile. Uses generic readers and writers to support writing and reading of the stats in the table's default format.
-
Field Summary
Fields
Modifier and Type | Field | Description
static final Types.NestedField | DATA_FILE_COUNT | Deprecated. Will be removed in 1.12.0.
static final Types.NestedField | DATA_RECORD_COUNT | Deprecated. Will be removed in 1.12.0.
static final Types.NestedField | DV_COUNT | Deprecated. Will be removed in 1.12.0.
static final Types.NestedField | EQUALITY_DELETE_FILE_COUNT | Deprecated. Will be removed in 1.12.0.
static final Types.NestedField | EQUALITY_DELETE_RECORD_COUNT | Deprecated. Will be removed in 1.12.0.
static final Types.NestedField | LAST_UPDATED_AT | Deprecated. Will be removed in 1.12.0.
static final Types.NestedField | LAST_UPDATED_SNAPSHOT_ID | Deprecated. Will be removed in 1.12.0.
static final int | PARTITION_FIELD_ID | Deprecated. Will be removed in 1.12.0.
static final String | PARTITION_FIELD_NAME | Deprecated. Will be removed in 1.12.0.
static final Types.NestedField | POSITION_DELETE_FILE_COUNT | Deprecated. Will be removed in 1.12.0.
static final Types.NestedField | POSITION_DELETE_RECORD_COUNT | Deprecated. Will be removed in 1.12.0.
static final Types.NestedField | SPEC_ID | Deprecated. Will be removed in 1.12.0.
static final Types.NestedField | TOTAL_DATA_FILE_SIZE_IN_BYTES | Deprecated. Will be removed in 1.12.0.
static final Types.NestedField | TOTAL_RECORD_COUNT | Deprecated. Will be removed in 1.12.0.
-
Method Summary
Modifier and Type | Method | Description
static PartitionStatisticsFile | computeAndWriteStatsFile(Table table) | Incrementally computes the stats for the snapshots after the latest one that has a partition stats file, up to the current snapshot, merges them, and writes the combined result into a PartitionStatisticsFile for the table's current snapshot.
static PartitionStatisticsFile | computeAndWriteStatsFile(Table table, long snapshotId) | Incrementally computes the stats for the snapshots after the latest one that has a partition stats file, up to the given snapshot, merges them, and writes the combined result into a PartitionStatisticsFile for the given snapshot.
static CloseableIterable<PartitionStats> | readPartitionStatsFile(Schema schema, InputFile inputFile) | Deprecated. Will be removed in 1.12.0; use PartitionStatisticsScan instead.
static Schema | schema(Types.StructType unifiedPartitionType, int formatVersion) | Deprecated. Will be removed in 1.12.0.
-
Field Details
-
PARTITION_FIELD_ID
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.EMPTY_PARTITION_FIELD instead.
-
PARTITION_FIELD_NAME
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.EMPTY_PARTITION_FIELD instead.
-
SPEC_ID
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.SPEC_ID instead.
-
DATA_RECORD_COUNT
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.DATA_RECORD_COUNT instead.
-
DATA_FILE_COUNT
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.DATA_FILE_COUNT instead.
-
TOTAL_DATA_FILE_SIZE_IN_BYTES
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.TOTAL_DATA_FILE_SIZE_IN_BYTES instead.
-
POSITION_DELETE_RECORD_COUNT
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.POSITION_DELETE_RECORD_COUNT instead.
-
POSITION_DELETE_FILE_COUNT
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.POSITION_DELETE_FILE_COUNT instead.
-
EQUALITY_DELETE_RECORD_COUNT
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.EQUALITY_DELETE_RECORD_COUNT instead.
-
EQUALITY_DELETE_FILE_COUNT
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.EQUALITY_DELETE_FILE_COUNT instead.
-
TOTAL_RECORD_COUNT
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.TOTAL_RECORD_COUNT instead.
-
LAST_UPDATED_AT
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.LAST_UPDATED_AT instead.
-
LAST_UPDATED_SNAPSHOT_ID
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.LAST_UPDATED_SNAPSHOT_ID instead.
-
DV_COUNT
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.DV_COUNT instead.
-
-
Method Details
-
schema
Deprecated. Will be removed in 1.12.0. Use PartitionStatistics.schema(StructType, int) instead.
Generates the partition stats file schema for a given format version based on a combined partition type which considers all specs in a table.
Parameters:
unifiedPartitionType - unified partition schema type. Could be calculated by Partitioning.partitionType(Table).
Returns:
a schema that corresponds to the provided unified partition type.
-
computeAndWriteStatsFile
Incrementally computes the stats for the snapshots after the latest one that has a partition stats file, up to the current snapshot, merges them, and writes the combined result into a PartitionStatisticsFile for the table's current snapshot. Does a full compute if no previous statistics file exists.
Parameters:
table - the Table for which the partition statistics are computed.
Returns:
PartitionStatisticsFile for the current snapshot, or null if no statistics are present.
Throws:
IOException
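The incremental behaviour described above can be illustrated with a minimal, library-free sketch. It assumes that stats are additive per partition and that a missing previous statistics file is modelled as a null map; StatsSketch, IncrementalStatsSketch and the partition keys are hypothetical names for illustration and are not part of Iceberg, whose actual implementation may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for one partition's statistics; not an Iceberg class.
final class StatsSketch {
  final long dataRecordCount;
  final int dataFileCount;

  StatsSketch(long dataRecordCount, int dataFileCount) {
    this.dataRecordCount = dataRecordCount;
    this.dataFileCount = dataFileCount;
  }

  // Adds the counters contributed by newer snapshots to the stored counters.
  StatsSketch mergeWith(StatsSketch delta) {
    return new StatsSketch(
        dataRecordCount + delta.dataRecordCount, dataFileCount + delta.dataFileCount);
  }
}

final class IncrementalStatsSketch {
  // previous: stats read from the last statistics file, or null if none exists.
  // delta: stats computed for the snapshots after that file. When previous is
  // null, delta is assumed to cover the full history (the "full compute" case).
  static Map<String, StatsSketch> combine(
      Map<String, StatsSketch> previous, Map<String, StatsSketch> delta) {
    Map<String, StatsSketch> result = new HashMap<>();
    if (previous != null) {
      result.putAll(previous);
    }
    for (Map.Entry<String, StatsSketch> entry : delta.entrySet()) {
      // Existing partitions are summed; new partitions are inserted as-is.
      result.merge(entry.getKey(), entry.getValue(), StatsSketch::mergeWith);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, StatsSketch> previous = new HashMap<>();
    previous.put("day=2024-01-01", new StatsSketch(100L, 2));

    Map<String, StatsSketch> delta = new HashMap<>();
    delta.put("day=2024-01-01", new StatsSketch(50L, 1));
    delta.put("day=2024-01-02", new StatsSketch(10L, 1));

    Map<String, StatsSketch> merged = combine(previous, delta);
    for (Map.Entry<String, StatsSketch> entry : merged.entrySet()) {
      System.out.println(
          entry.getKey()
              + " records=" + entry.getValue().dataRecordCount
              + " files=" + entry.getValue().dataFileCount);
    }
  }
}
```

In real code none of this is written by hand: a call to computeAndWriteStatsFile(table) performs the computation and the write, and the sketch only shows why a previous statistics file lets the work be limited to the newer snapshots.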
-
computeAndWriteStatsFile
public static PartitionStatisticsFile computeAndWriteStatsFile(Table table, long snapshotId) throws IOException
Incrementally computes the stats for the snapshots after the latest one that has a partition stats file, up to the given snapshot, merges them, and writes the combined result into a PartitionStatisticsFile for the given snapshot. Does a full compute if no previous statistics file exists.
Parameters:
table - the Table for which the partition statistics are computed.
snapshotId - snapshot for which partition statistics are computed.
Returns:
PartitionStatisticsFile for the given snapshot, or null if no statistics are present.
Throws:
IOException
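One way to picture how a starting point for the incremental compute could be chosen is to walk the ancestry of the given snapshot until a snapshot with a statistics file is found. The sketch below is an assumption for illustration only: BaseSnapshotSketch, the parentOf map and the snapshotsWithStats set are hypothetical, the walk starts at the given snapshot itself, and the snapshot lineage is assumed to be acyclic. It does not claim to reproduce Iceberg's internal logic.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;

// Hypothetical helper; not an Iceberg class.
final class BaseSnapshotSketch {
  // parentOf maps a snapshot id to its parent id; a root snapshot has no entry.
  // Returns the nearest snapshot (starting at snapshotId) that has a stats file,
  // or empty when none exists, which corresponds to the full-compute case.
  static OptionalLong nearestSnapshotWithStats(
      long snapshotId, Map<Long, Long> parentOf, Set<Long> snapshotsWithStats) {
    Long current = snapshotId;
    while (current != null) {
      if (snapshotsWithStats.contains(current)) {
        return OptionalLong.of(current);
      }
      current = parentOf.get(current);
    }
    return OptionalLong.empty();
  }

  public static void main(String[] args) {
    // Lineage: 1 <- 2 <- 3, with a stats file registered only for snapshot 1.
    Map<Long, Long> parentOf = new HashMap<>();
    parentOf.put(3L, 2L);
    parentOf.put(2L, 1L);
    Set<Long> snapshotsWithStats = new HashSet<>();
    snapshotsWithStats.add(1L);

    OptionalLong base = nearestSnapshotWithStats(3L, parentOf, snapshotsWithStats);
    System.out.println(
        base.isPresent() ? "incremental from " + base.getAsLong() : "full compute");
  }
}
```

With a base snapshot found, only snapshots 2 and 3 in this example would need to be processed; with an empty result, the whole history up to the given snapshot would be computed.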
-
readPartitionStatsFile
@Deprecated public static CloseableIterable<PartitionStats> readPartitionStatsFile(Schema schema, InputFile inputFile)
Deprecated. Will be removed in 1.12.0; use PartitionStatisticsScan instead.
Reads partition statistics from the specified InputFile using the given schema.
-