Class DataStatisticsOperator

java.lang.Object
org.apache.flink.streaming.api.operators.AbstractStreamOperator<StatisticsOrRecord>
org.apache.iceberg.flink.sink.shuffle.DataStatisticsOperator
All Implemented Interfaces:
Serializable, org.apache.flink.api.common.state.CheckpointListener, org.apache.flink.runtime.operators.coordination.OperatorEventHandler, org.apache.flink.streaming.api.operators.Input<org.apache.flink.table.data.RowData>, org.apache.flink.streaming.api.operators.KeyContext, org.apache.flink.streaming.api.operators.KeyContextHandler, org.apache.flink.streaming.api.operators.OneInputStreamOperator<org.apache.flink.table.data.RowData,StatisticsOrRecord>, org.apache.flink.streaming.api.operators.SetupableStreamOperator<StatisticsOrRecord>, org.apache.flink.streaming.api.operators.StreamOperator<StatisticsOrRecord>, org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.CheckpointedStreamOperator, org.apache.flink.streaming.api.operators.YieldingOperator<StatisticsOrRecord>

@Internal public class DataStatisticsOperator extends org.apache.flink.streaming.api.operators.AbstractStreamOperator<StatisticsOrRecord> implements org.apache.flink.streaming.api.operators.OneInputStreamOperator<org.apache.flink.table.data.RowData,StatisticsOrRecord>, org.apache.flink.runtime.operators.coordination.OperatorEventHandler
DataStatisticsOperator collects traffic distribution statistics. A custom partitioner shall be attached to the DataStatisticsOperator output. The custom partitioner leverages the statistics to shuffle record to improve data clustering while maintaining relative balanced traffic distribution to downstream subtasks.
See Also:
  • Field Summary

    Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator

    chainingStrategy, config, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    handleOperatorEvent(org.apache.flink.runtime.operators.coordination.OperatorEvent event)
     
    void
    initializeState(org.apache.flink.runtime.state.StateInitializationContext context)
     
    void
     
    void
    processElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.flink.table.data.RowData> streamRecord)
     
    void
    snapshotState(org.apache.flink.runtime.state.StateSnapshotContext context)
     

    Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator

    close, finish, getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, prepareSnapshotPreBarrier, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, setup, snapshotState, useSplittableTimers

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener

    notifyCheckpointAborted, notifyCheckpointComplete

    Methods inherited from interface org.apache.flink.streaming.api.operators.Input

    processLatencyMarker, processRecordAttributes, processWatermark, processWatermarkStatus

    Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext

    getCurrentKey, setCurrentKey

    Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler

    hasKeyContext

    Methods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator

    setKeyContextElement

    Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator

    close, finish, getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
  • Method Details

    • initializeState

      public void initializeState(org.apache.flink.runtime.state.StateInitializationContext context) throws Exception
      Specified by:
      initializeState in interface org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.CheckpointedStreamOperator
      Overrides:
      initializeState in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<StatisticsOrRecord>
      Throws:
      Exception
    • open

      public void open() throws Exception
      Specified by:
      open in interface org.apache.flink.streaming.api.operators.StreamOperator<StatisticsOrRecord>
      Overrides:
      open in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<StatisticsOrRecord>
      Throws:
      Exception
    • handleOperatorEvent

      public void handleOperatorEvent(org.apache.flink.runtime.operators.coordination.OperatorEvent event)
      Specified by:
      handleOperatorEvent in interface org.apache.flink.runtime.operators.coordination.OperatorEventHandler
    • processElement

      public void processElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.flink.table.data.RowData> streamRecord)
      Specified by:
      processElement in interface org.apache.flink.streaming.api.operators.Input<org.apache.flink.table.data.RowData>
    • snapshotState

      public void snapshotState(org.apache.flink.runtime.state.StateSnapshotContext context) throws Exception
      Specified by:
      snapshotState in interface org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.CheckpointedStreamOperator
      Overrides:
      snapshotState in class org.apache.flink.streaming.api.operators.AbstractStreamOperator<StatisticsOrRecord>
      Throws:
      Exception