[DOCS-16193] [SERVER] Mention non-guaranteed order of $accumulator Created: 09/Jun/23 Updated: 22/Jan/24 |
|
| Status: | Backlog |
| Project: | Documentation |
| Component/s: | manual, Server |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | David Percy | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | backlog, request | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Participants: | |
| Days since reply: | 34 weeks, 2 days ago |
| Description |
|
The $accumulator operator lets you define a custom accumulator using JavaScript: https://www.mongodb.com/docs/v6.0/reference/operator/aggregation/accumulator/

Part of the contract between the user and the server is that the server is free to decide the order and grouping when it calls init()/accumulate()/merge(), so the user is responsible for making sure these functions are insensitive to order and grouping.

We do allude to this, because we document the conditions under which merge() is called: https://www.mongodb.com/docs/v6.0/reference/operator/aggregation/accumulator/#merge-two-states-with--merge

But maybe we should be more explicit about the assumptions the server makes about the user's init()/accumulate()/merge() functions. Here's an example of a bad, grouping-sensitive $accumulator:
This accumulator is bad because it gives you a different answer depending on how the server chooses to do the grouping:
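As a hypothetical illustration of this kind of grouping sensitivity (a sketch, not the ticket's original example), the plain JavaScript below simulates a server that splits the input into groups, accumulates each group into its own state, and then merges the states. The names `badAvg` and `runGrouped` are invented for this sketch, and the simulation only assumes the init/accumulate/merge/finalize roles described in the $accumulator documentation; it does not model how the server actually chooses groups.

```javascript
// Hypothetical accumulator that tries to compute an average.
// accumulate() is correct, but merge() averages the two averages
// without weighting by count, so the result depends on the grouping.
const badAvg = {
  init: () => ({ avg: 0, count: 0 }),
  accumulate: (state, x) => ({
    count: state.count + 1,
    avg: (state.avg * state.count + x) / (state.count + 1),
  }),
  merge: (a, b) => ({
    count: a.count + b.count,
    avg: (a.avg + b.avg) / 2, // bug: ignores a.count and b.count
  }),
  finalize: (state) => state.avg,
};

// Simulates one possible server strategy: one state per group,
// then a left-to-right merge of all states.
function runGrouped(acc, groups) {
  const states = groups.map((group) =>
    group.reduce((state, x) => acc.accumulate(state, x), acc.init())
  );
  return acc.finalize(states.reduce((a, b) => acc.merge(a, b)));
}

console.log(runGrouped(badAvg, [[10, 20, 60]]));   // 30
console.log(runGrouped(badAvg, [[10], [20, 60]])); // 25
console.log(runGrouped(badAvg, [[10, 20], [60]])); // 37.5
```

The same three input values produce three different answers, purely because of where the group boundaries fall.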
If you think something precise would be useful, I think this captures it:
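One possible way to make such a condition checkable, offered as an assumption rather than the ticket's own wording, is to treat the states as needing a commutative, associative merge() with init() as its identity, and accumulate() as equivalent to merging in a fresh single-value state. The sketch below spot-checks those candidate laws on sample inputs in plain JavaScript. The names `violatedLaws`, `goodAvg`, and `lastSeen` are hypothetical, and the check assumes the functions do not mutate their arguments.

```javascript
// Structural equality via JSON; adequate for the plain states used here.
const sameState = (a, b) => JSON.stringify(a) === JSON.stringify(b);

// Returns the names of the candidate laws that fail on the sample inputs.
// Passing this check is evidence, not proof, that an accumulator is safe.
function violatedLaws(acc, samples) {
  const single = (x) => acc.accumulate(acc.init(), x);
  const failed = new Set();
  for (const x of samples) {
    const a = single(x);
    // Merging with a fresh state should change nothing.
    if (!sameState(acc.merge(acc.init(), a), a)) failed.add("init-is-identity");
    for (const y of samples) {
      const b = single(y);
      // The order of the two merge arguments should not matter.
      if (!sameState(acc.merge(a, b), acc.merge(b, a))) {
        failed.add("merge-commutative");
      }
      // Accumulating y should match merging in a state that holds only y.
      if (!sameState(acc.accumulate(a, y), acc.merge(a, b))) {
        failed.add("accumulate-is-merge");
      }
      for (const z of samples) {
        const c = single(z);
        // The grouping of successive merges should not matter.
        const left = acc.merge(acc.merge(a, b), c);
        const right = acc.merge(a, acc.merge(b, c));
        if (!sameState(left, right)) failed.add("merge-associative");
      }
    }
  }
  return [...failed].sort();
}

// Keeps sum and count, so merge() can combine states exactly.
const goodAvg = {
  init: () => ({ sum: 0, count: 0 }),
  accumulate: (s, x) => ({ sum: s.sum + x, count: s.count + 1 }),
  merge: (a, b) => ({ sum: a.sum + b.sum, count: a.count + b.count }),
};

// Order-sensitive: the result is whichever value happened to come last.
const lastSeen = {
  init: () => null,
  accumulate: (s, x) => x,
  merge: (a, b) => b,
};

console.log(violatedLaws(goodAvg, [1, 2, 3]));  // []
console.log(violatedLaws(lastSeen, [1, 2, 3])); // [ 'merge-commutative' ]
```

Integer samples are used so that floating-point rounding does not show up as a spurious associativity failure.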
|
| Comments |
| Comment by Sarah Olson [ 12/Jun/23 ] |
|
Thanks david.percy@mongodb.com. We'll take a look as part of our backlog. |
| Comment by David Percy [ 09/Jun/23 ] |
|
alya.berciu@mongodb.com let me know if this is helpful. Rereading the Slack thread which prompted this:
I think that's a fair description of the behavior today, but it's not something we guarantee, and not something users should try to predict. For example (just thinking out loud), if we had more intraquery parallelism by default in the future, that would be a new reason the server might choose to keep several states. |