Enum Class StatisticsType

java.lang.Object
java.lang.Enum<StatisticsType>
org.apache.iceberg.flink.sink.shuffle.StatisticsType
All Implemented Interfaces:
Serializable, Comparable<StatisticsType>, Constable

public enum StatisticsType extends Enum<StatisticsType>
Range distribution requires gathering statistics on the sort keys to determine the range boundaries used to distribute (cluster) rows before they reach the writer operators.
  • Enum Constant Details

    • Map

      public static final StatisticsType Map
      Tracks the data statistics as a Map<SortKey, Long> of key frequencies. It works better for low-cardinality scenarios (like country, event_type, etc.) where the cardinalities are in the hundreds or thousands.
      • Pro: accurate measurement on the statistics/weight of every key.
      • Con: memory footprint can be large if the key cardinality is high.
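A minimal sketch of the frequency-map idea described above. It is not the Iceberg implementation: the class name `MapStatisticsSketch` and the method `countKeys` are hypothetical, and `String` stands in for `SortKey` so the example is self-contained.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapStatisticsSketch {
  // Counts how often each key occurs. String is an assumed stand-in for SortKey.
  static Map<String, Long> countKeys(Iterable<String> keys) {
    Map<String, Long> frequency = new HashMap<>();
    for (String key : keys) {
      // merge() inserts 1 for a new key, otherwise adds 1 to the existing count.
      frequency.merge(key, 1L, Long::sum);
    }
    return frequency;
  }

  public static void main(String[] args) {
    Map<String, Long> frequency = countKeys(List.of("US", "DE", "US", "FR", "US"));
    System.out.println(frequency.get("US")); // exact weight of the key "US"
  }
}
```

The map holds one entry per distinct key, which is why the memory footprint grows with key cardinality.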
    • Sketch

      public static final StatisticsType Sketch
      Samples the sort keys via reservoir sampling, then splits the range partitions using range bounds derived from the sampled values. It works better for high-cardinality scenarios (like device_id, user_id, uuid, etc.) where the cardinalities can be in the millions or billions.
      • Pro: relatively low memory footprint for high-cardinality sort keys.
      • Con: imprecise approximation with potentially lower accuracy.
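The two steps above (sample, then derive range bounds) can be illustrated roughly as follows. This is a sketch under stated assumptions, not the Iceberg code: the class and method names are hypothetical, `String` stands in for `SortKey`, the sampler is the classic "Algorithm R" reservoir sampler, and the bounds are simply evenly spaced elements of the sorted sample.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class ReservoirRangeBoundsSketch {
  // Algorithm R: keeps a uniform random sample of at most `capacity` keys.
  static List<String> sample(Iterable<String> keys, int capacity, Random rnd) {
    List<String> reservoir = new ArrayList<>(capacity);
    long seen = 0;
    for (String key : keys) {
      seen++;
      if (reservoir.size() < capacity) {
        reservoir.add(key); // fill phase: keep the first `capacity` keys
      } else {
        // Draw a slot in [0, seen); the key is kept only if the slot
        // falls inside the reservoir, i.e. with probability capacity / seen.
        long slot = (long) (rnd.nextDouble() * seen);
        if (slot < capacity) {
          reservoir.set((int) slot, key);
        }
      }
    }
    return reservoir;
  }

  // Picks up to numPartitions - 1 evenly spaced bounds from the sorted sample.
  static List<String> rangeBounds(List<String> sample, int numPartitions) {
    List<String> sorted = new ArrayList<>(sample);
    Collections.sort(sorted);
    List<String> bounds = new ArrayList<>();
    for (int i = 1; i < numPartitions; i++) {
      int index = (int) ((long) i * sorted.size() / numPartitions);
      if (index < sorted.size()) {
        bounds.add(sorted.get(index));
      }
    }
    return bounds;
  }
}
```

Memory is bounded by the reservoir capacity regardless of how many distinct keys appear, at the cost of bounds that only approximate the true key distribution.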
    • Auto

      public static final StatisticsType Auto
      Initially uses Map for statistics tracking. If the key cardinality turns out to be high, automatically switches to Sketch sampling.
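The switch decision alone might look like the sketch below. Everything here is an assumption made for illustration: the class `AutoSwitchSketch`, the `switchThreshold` parameter, and the choice to discard the map on switching are hypothetical, and this page does not say what threshold the real implementation uses or how it carries existing counts over.

```java
import java.util.HashMap;
import java.util.Map;

class AutoSwitchSketch {
  private final int switchThreshold; // hypothetical cardinality limit
  private final Map<String, Long> frequency = new HashMap<>();
  private boolean sketchMode = false;

  AutoSwitchSketch(int switchThreshold) {
    this.switchThreshold = switchThreshold;
  }

  void add(String key) {
    if (sketchMode) {
      // A sampler (for example reservoir sampling) would consume the key here;
      // omitted to keep the sketch focused on the switch itself.
      return;
    }
    frequency.merge(key, 1L, Long::sum);
    if (frequency.size() > switchThreshold) {
      // Too many distinct keys for exact tracking: switch and free the map.
      sketchMode = true;
      frequency.clear();
    }
  }

  boolean isSketchMode() {
    return sketchMode;
  }
}
```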
  • Method Details

    • values

      public static StatisticsType[] values()
      Returns an array containing the constants of this enum class, in the order they are declared.
      Returns:
      an array containing the constants of this enum class, in the order they are declared
    • valueOf

      public static StatisticsType valueOf(String name)
      Returns the enum constant of this class with the specified name. The string must match exactly an identifier used to declare an enum constant in this class. (Extraneous whitespace characters are not permitted.)
      Parameters:
      name - the name of the enum constant to be returned.
      Returns:
      the enum constant with the specified name
      Throws:
      IllegalArgumentException - if this enum class has no constant with the specified name
      NullPointerException - if the argument is null
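A short usage sketch for `values()` and `valueOf(String)`. To stay self-contained it declares a local stand-in enum with the same three constants; real code would import `org.apache.iceberg.flink.sink.shuffle.StatisticsType` instead. The wrapper class `StatisticsTypeLookup` and its `parse` helper are hypothetical.

```java
import java.util.Arrays;

class StatisticsTypeLookup {
  // Local stand-in mirroring the documented constants (an assumption for this example).
  enum StatisticsType { Map, Sketch, Auto }

  // Resolves a name to a constant, listing the valid names on failure.
  static StatisticsType parse(String name) {
    try {
      // The match is exact and case-sensitive: "Sketch" works, "sketch" does not.
      return StatisticsType.valueOf(name);
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException(
          "Unknown statistics type '" + name + "', expected one of "
              + Arrays.toString(StatisticsType.values()), e);
    }
  }

  public static void main(String[] args) {
    System.out.println(parse("Auto"));
    for (StatisticsType type : StatisticsType.values()) {
      System.out.println(type.name()); // constants in declaration order
    }
  }
}
```

Because the constants are declared in mixed case, user-supplied strings such as configuration values need to match that casing exactly or be normalized by the caller before the lookup.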