Class DefaultSplitAssigner

java.lang.Object
org.apache.iceberg.flink.source.assigner.DefaultSplitAssigner
All Implemented Interfaces:
Closeable, AutoCloseable, SplitAssigner

@Internal public class DefaultSplitAssigner extends Object implements SplitAssigner
Because the enumerator calls all methods from the source coordinator thread, no locking is needed.
  • Constructor Details

  • Method Details

    • getNext

      public GetSplitResult getNext(@Nullable String hostname)
      Description copied from interface: SplitAssigner
      Request a new split from the assigner when the enumerator is trying to assign splits to awaiting readers.

      If the enumerator wasn't able to assign the split (e.g., the reader disconnected), it should call SplitAssigner.onUnassignedSplits(java.util.Collection<org.apache.iceberg.flink.source.split.IcebergSourceSplit>) to return the split.

      Specified by:
      getNext in interface SplitAssigner
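
The request-and-return contract described above can be sketched with plain Java collections. This is a minimal illustration, not the Iceberg implementation: FifoAssignerSketch and AssignmentDemo are hypothetical names, splits are modeled as strings, FIFO ordering is an assumption, and a null return stands in for the status carried by GetSplitResult.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for an assigner; splits are plain strings here.
class FifoAssignerSketch {
  // No locking: all calls are assumed to come from a single thread.
  private final Queue<String> pending = new ArrayDeque<>();

  // Returns the next pending split, or null when none is available.
  String getNext() {
    return pending.poll();
  }

  // Adds newly discovered splits to the end of the queue.
  void onDiscoveredSplits(Collection<String> splits) {
    pending.addAll(splits);
  }

  // Takes back splits that could not be delivered to a reader.
  void onUnassignedSplits(Collection<String> splits) {
    pending.addAll(splits);
  }

  int pendingSplitCount() {
    return pending.size();
  }
}

class AssignmentDemo {
  // Tries to hand one split to a reader. If the reader is gone,
  // the split is returned to the assigner so it is not lost.
  static boolean assignOne(FifoAssignerSketch assigner, boolean readerAlive) {
    String split = assigner.getNext();
    if (split == null) {
      return false; // nothing to assign right now
    }
    if (!readerAlive) {
      assigner.onUnassignedSplits(List.of(split));
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    FifoAssignerSketch assigner = new FifoAssignerSketch();
    assigner.onDiscoveredSplits(List.of("split-1", "split-2"));
    assignOne(assigner, true);  // reader accepts the first split
    assignOne(assigner, false); // reader disconnected, split goes back
    System.out.println("pending splits: " + assigner.pendingSplitCount());
  }
}
```

Returning the split through onUnassignedSplits keeps the assigner as the single owner of unassigned work, so a disconnected reader never causes a split to be dropped.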
    • onDiscoveredSplits

      public void onDiscoveredSplits(Collection<IcebergSourceSplit> splits)
      Description copied from interface: SplitAssigner
      Add new splits discovered by the enumerator.
      Specified by:
      onDiscoveredSplits in interface SplitAssigner
    • onUnassignedSplits

      public void onUnassignedSplits(Collection<IcebergSourceSplit> splits)
      Description copied from interface: SplitAssigner
      Forward the addSplitsBack event (for a failed reader) to the assigner.
      Specified by:
      onUnassignedSplits in interface SplitAssigner
    • state

      The simple assigner tracks only unassigned splits.
      Specified by:
      state in interface SplitAssigner
    • isAvailable

      public CompletableFuture<Void> isAvailable()
      Description copied from interface: SplitAssigner
      The enumerator can get a notification via the CompletableFuture when the assigner has more splits available later. The enumerator should schedule assignment in the thenAccept action of the future.

      The assigner returns the same future if this method is called again before the previous future is completed.

      The future may be completed from a thread other than the coordinator thread, e.g., by a separate thread used for event time alignment.

      If the enumerator needs to trigger an action upon the future's completion, it may want to run that action in the coordinator thread using SplitEnumeratorContext.runInCoordinatorThread(Runnable).

      Specified by:
      isAvailable in interface SplitAssigner
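
The availability contract above (one shared future until it completes, with assignment scheduled in thenAccept) can be sketched as follows. This is an illustration under stated assumptions, not the Iceberg code: AvailabilitySketch is a hypothetical class, splits are strings, and the future is completed when new splits arrive. Returning an already completed future while splits are pending is also an assumption of this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical assigner that exposes availability through a future.
class AvailabilitySketch {
  private final Queue<String> pending = new ArrayDeque<>();
  private CompletableFuture<Void> availableFuture; // null until first requested

  CompletableFuture<Void> isAvailable() {
    if (!pending.isEmpty()) {
      // Splits are already waiting, so the caller need not wait.
      return CompletableFuture.completedFuture(null);
    }
    if (availableFuture == null || availableFuture.isDone()) {
      availableFuture = new CompletableFuture<>();
    }
    // The same incomplete future is returned on repeated calls.
    return availableFuture;
  }

  void onDiscoveredSplits(Collection<String> splits) {
    pending.addAll(splits);
    if (availableFuture != null && !pending.isEmpty()) {
      // Wakes up whoever registered a thenAccept action; no-op if already done.
      availableFuture.complete(null);
    }
  }

  // Returns the next pending split, or null when none is available.
  String getNext() {
    return pending.poll();
  }
}

class AvailabilityDemo {
  public static void main(String[] args) {
    AvailabilitySketch assigner = new AvailabilitySketch();
    List<String> assigned = new ArrayList<>();

    // Schedule assignment for when splits become available.
    assigner.isAvailable().thenAccept(ignored -> assigned.add(assigner.getNext()));

    assigner.onDiscoveredSplits(List.of("split-1"));
    System.out.println("assigned: " + assigned);
  }
}
```

In this sketch the thenAccept action runs on whichever thread completes the future. In a real enumerator, that is the reason for handing the action to SplitEnumeratorContext.runInCoordinatorThread(Runnable) when it must touch coordinator state.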
    • pendingSplitCount

      public int pendingSplitCount()
      Description copied from interface: SplitAssigner
      Return the number of pending splits that haven't been assigned yet.

      The enumerator can poll this API to publish a metric on the number of pending splits.

      The enumerator can also use this information to throttle split discovery for streaming reads. If the assigner already tracks many pending splits, it is undesirable to discover more splits and track them as well, because doing so increases the memory footprint and the enumerator checkpoint size.

      Throttling works better together with ScanContext.maxPlanningSnapshotCount(). Otherwise, the next split discovery after throttling will just discover all non-enumerated snapshots and splits, which defeats the purpose of throttling.

      Specified by:
      pendingSplitCount in interface SplitAssigner
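
The throttling idea above can be sketched as a small policy object. This is a hypothetical illustration, not part of Iceberg: DiscoveryThrottleSketch, maxPendingSplits, and both method names are assumptions, and the second field only mirrors the intent of ScanContext.maxPlanningSnapshotCount().

```java
// Hypothetical policy that decides whether and how much to discover.
class DiscoveryThrottleSketch {
  private final int maxPendingSplits;         // assumed threshold for pausing discovery
  private final int maxPlanningSnapshotCount; // assumed cap on snapshots per discovery cycle

  DiscoveryThrottleSketch(int maxPendingSplits, int maxPlanningSnapshotCount) {
    this.maxPendingSplits = maxPendingSplits;
    this.maxPlanningSnapshotCount = maxPlanningSnapshotCount;
  }

  // Skip discovery while the assigner already tracks too many splits.
  boolean shouldDiscover(int pendingSplitCount) {
    return pendingSplitCount < maxPendingSplits;
  }

  // Bound the work of the next cycle so that resuming after a pause
  // does not plan every snapshot that accumulated in the meantime.
  int snapshotsToPlan(int unplannedSnapshots) {
    return Math.min(unplannedSnapshots, maxPlanningSnapshotCount);
  }
}

class ThrottleDemo {
  public static void main(String[] args) {
    DiscoveryThrottleSketch throttle = new DiscoveryThrottleSketch(100, 5);
    int pendingSplitCount = 40;  // value that would come from pendingSplitCount()
    int unplannedSnapshots = 12; // snapshots not yet enumerated
    if (throttle.shouldDiscover(pendingSplitCount)) {
      System.out.println("plan snapshots: " + throttle.snapshotsToPlan(unplannedSnapshots));
    }
  }
}
```

The two limits are complementary: the pending-split check decides when discovery pauses, and the snapshot cap keeps the first cycle after a pause from refilling the assigner all at once.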
    • pendingRecords

      public long pendingRecords()
      Description copied from interface: SplitAssigner
      Return the number of pending records, which can act as a measure of the source lag. This value may be an estimate if the exact number of records cannot be computed.
      Specified by:
      pendingRecords in interface SplitAssigner