- Type: Improvement
- Resolution: Unresolved
- Priority: Minor - P4
- None
- Affects Version/s: None
- Component/s: None
- Atlas Streams
Consider a pipeline with a `hoppingWindow`, like this one:
pipeline: [{ $group: { _id:"$customerId", customerDocs: {$push:"$$ROOT"}, } }]
Say there are 200 open windows. An incoming document is then absorbed into all 200 of them. Although the logical state size is O(200 docs), the actual memory usage is still roughly that of 1 doc, since documents are cheaply copyable via ref-counting etc.
Now, when such a state is checkpointed and recovered, we lose this sharing info, so today we end up with 200 separate copies of the doc after the recovery.
This causes ballooning in memory usage after the checkpoint has been recovered.
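To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It is an illustration only, not the Atlas Streams implementation: the function names (`checkpoint_naive`, `checkpoint_shared`, and so on), the use of JSON as the checkpoint format, and the idea of a shared document table are all assumptions made for this example. Python object references stand in for ref-counted documents.

```python
import json

NUM_WINDOWS = 200  # matches the 200 open windows in the description


def checkpoint_naive(windows):
    # Hypothetical: every window serializes its own copy of each doc it holds.
    return [json.dumps(w) for w in windows]


def recover_naive(blobs):
    # Each blob is parsed independently, so sharing between windows is lost.
    return [json.loads(b) for b in blobs]


def checkpoint_shared(windows):
    # Hypothetical alternative: write each distinct document object once and
    # let windows store indexes into that table instead of full copies.
    table, index_of, refs = [], {}, []
    for w in windows:
        row = []
        for doc in w:
            key = id(doc)  # object identity stands in for a shared ref
            if key not in index_of:
                index_of[key] = len(table)
                table.append(doc)
            row.append(index_of[key])
        refs.append(row)
    return json.dumps({"docs": table, "windows": refs})


def recover_shared(blob):
    state = json.loads(blob)
    docs = state["docs"]
    # Windows point back at the same recovered objects, restoring sharing.
    return [[docs[i] for i in row] for row in state["windows"]]


def distinct_objects(windows):
    # Number of physically distinct document objects held across all windows.
    return len({id(doc) for w in windows for doc in w})


doc = {"customerId": 7, "amount": 42}
windows = [[doc] for _ in range(NUM_WINDOWS)]  # one doc shared by all windows

naive = recover_naive(checkpoint_naive(windows))
shared = recover_shared(checkpoint_shared(windows))
print(distinct_objects(windows), distinct_objects(naive), distinct_objects(shared))
```

In this sketch the naive round trip turns one shared document into one copy per window, while the table-based variant keeps a single copy. Whether a table keyed on document identity is the right approach for the real checkpoint format is an open design question.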