Details
-
Task
-
Resolution: Fixed
-
Major - P3
-
Atlas Streams
-
Fully Compatible
-
Sprint 30, Sprint 31
Description
Watermark alignment prevents a large drift in watermarks across sources. It is currently relevant only for Kafka, where we maintain a separate watermark per partition: alignment ensures that the minimum and maximum watermarks across the Kafka topic partition set never drift apart by more than the configured max drift. If one Kafka partition is consumed significantly faster than another and the watermark drift exceeds the max drift, every partition whose watermark exceeds (min_watermark + max_drift) is temporarily paused until the partition that is falling behind catches up.
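The pause rule above can be sketched in a few lines. This is only an illustration of the (min_watermark + max_drift) check, not the actual implementation: the function name `partitions_to_pause`, the dict-of-watermarks representation, and the millisecond units are all assumptions made for the example.

```python
from typing import Dict, Set


def partitions_to_pause(watermarks: Dict[int, int], max_drift_ms: int) -> Set[int]:
    """Return the ids of partitions whose watermark exceeds min_watermark + max_drift.

    `watermarks` maps a (hypothetical) partition id to its current watermark in ms.
    """
    if not watermarks:
        return set()
    min_watermark = min(watermarks.values())
    limit = min_watermark + max_drift_ms
    # Only partitions strictly ahead of the limit are paused; the slowest
    # partition is never paused, so it can always catch up.
    return {pid for pid, wm in watermarks.items() if wm > limit}


# Partition 0 is lagging; partition 2 is more than 5000 ms ahead of it.
watermarks = {0: 1_000, 1: 4_500, 2: 9_000}
print(sorted(partitions_to_pause(watermarks, max_drift_ms=5_000)))  # [2]

# Once partition 0 catches up to 4500, partition 2 is within the limit again.
watermarks[0] = 4_500
print(sorted(partitions_to_pause(watermarks, max_drift_ms=5_000)))  # []
```

Re-evaluating the check whenever a watermark advances is what makes the pause temporary: a paused partition resumes as soon as the minimum watermark moves far enough forward.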
Watermark alignment is mostly relevant to windowed pipelines, where it prevents the open window state from growing indefinitely: without it, the combined watermark cannot advance because one partition is falling behind, while a much faster partition keeps opening new windows.
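To make the effect on window state concrete, here is a rough sketch under simplifying assumptions that are not stated in the ticket: tumbling windows of a fixed size, a combined watermark equal to the minimum partition watermark, and a window that closes once the combined watermark reaches its end. The helper `max_open_windows` is hypothetical and gives an upper bound, since it assumes the fastest partition has produced events in every window up to its own watermark.

```python
from typing import Dict


def max_open_windows(watermarks: Dict[int, int], window_size_ms: int) -> int:
    """Upper bound on open tumbling windows for the given partition watermarks."""
    combined = min(watermarks.values())   # assumed: combined watermark is the minimum
    fastest = max(watermarks.values())
    first_open = combined // window_size_ms  # earliest window not yet closed
    last_open = fastest // window_size_ms    # latest window the fastest partition reached
    return last_open - first_open + 1


# Without alignment the fast partition can run arbitrarily far ahead.
print(max_open_windows({0: 1_000, 1: 9_000}, window_size_ms=1_000))  # 9

# With a max drift of 5000 ms the fast partition is held at or below 6000.
print(max_open_windows({0: 1_000, 1: 6_000}, window_size_ms=1_000))  # 6
```

Under these assumptions the number of open windows is bounded by roughly max_drift / window_size + 1, instead of growing for as long as the slow partition stays behind.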