- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Execution
It is not currently possible to retain state across documents in aggregation pipelines. This means that computations such as exponential moving averages can only be done using specialized operators (for example, $expMovingAvg in $setWindowFields). Another example is an algorithm such as LTTB (Largest-Triangle-Three-Buckets downsampling), where the result at position N depends on the result that was computed at position N-1.
MongoDB shares this limitation with SQL, whose declarative nature makes it extremely cumbersome to express something like an exponential moving average correctly:
    WITH RECURSIVE ordered AS (
        SELECT value, date, ROW_NUMBER() OVER (ORDER BY date) AS rn
        FROM measurements
    ),
    ema AS (
        -- Base case: seed with the first value
        SELECT rn, date, value, value AS ema_value
        FROM ordered
        WHERE rn = 1
        UNION ALL
        -- Recursive step: apply the EMA formula
        SELECT o.rn, o.date, o.value, 0.2 * o.value + 0.8 * e.ema_value
        FROM ema e
        JOIN ordered o ON o.rn = e.rn + 1
    )
    SELECT date, value, ema_value
    FROM ema
    ORDER BY date;
On the other hand, in a step-oriented, procedural environment like SAS, this is extremely simple:
    data result;
        set measurements;
        retain ema_value;
        if _N_ = 1 then ema_value = value;
        else ema_value = 0.2 * value + 0.8 * ema_value;
    run;
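For readers unfamiliar with SAS, the same retained-state idea can be sketched in Python. This is only an illustration of the semantics being requested, not a proposed MongoDB API: the function name `with_ema`, the `alpha` parameter, and the sample documents are all assumptions made for this example. The smoothing factor 0.2 is taken from the SQL and SAS snippets above.

```python
from typing import Iterable, Iterator


def with_ema(docs: Iterable[dict], alpha: float = 0.2) -> Iterator[dict]:
    """Yield each document with an added 'ema_value' field.

    Hypothetical helper: a single variable is carried from one document
    to the next, mirroring SAS's `retain ema_value`.
    """
    ema = None  # retained state; None until the first document is seen
    for doc in docs:
        if ema is None:
            # Seed with the first value, like `if _N_ = 1` in SAS
            ema = doc["value"]
        else:
            # Same formula as above: 0.2 * value + 0.8 * previous EMA
            ema = alpha * doc["value"] + (1 - alpha) * ema
        yield {**doc, "ema_value": ema}


# Made-up sample data standing in for the `measurements` collection
measurements = [
    {"date": "2024-01-02", "value": 20.0},
    {"date": "2024-01-01", "value": 10.0},
    {"date": "2024-01-03", "value": 30.0},
]

# Documents must be ordered before the stateful pass, as in ORDER BY date
ordered = sorted(measurements, key=lambda d: d["date"])
result = list(with_ema(ordered))
```

The point of the sketch is that the whole computation is one ordered pass with one piece of carried state, which is what a stateful pipeline stage or expression would need to expose.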
I would argue that MongoDB's aggregation pipelines are closer to this procedural approach than to declarative SQL. To build on this difference and on the benefits of the procedural approach, it should be possible to retain state across documents, similar to the SAS example above.
- is related to: SERVER-29339 allow using $reduce expression as accumulator in $group
- Backlog