[SERVER-43944] Accumulators (and possibly expressions) can allocate large amounts of memory Created: 10/Oct/19  Updated: 28/Aug/23

Status: Backlog
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Charlie Swanson Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-44174 $push and $addToSet should restrict m... Closed
Assigned Teams:
Query Execution
Operating System: ALL
Participants:
Case:

Description

Generally, every stage that builds a large data structure (such as $sort or $group) uses estimated sizes to cap its memory usage at 100MB. No such cap exists for expressions or accumulators, yet an accumulator like $push can end up allocating a lot of memory.
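To illustrate the kind of cap described above, here is a minimal sketch of size-estimate accounting. It is not the server's implementation: `MemoryTracker`, `MemoryLimitExceeded`, and `MEMORY_LIMIT_BYTES` are hypothetical names, and reading "100MB" as 100 * 1024 * 1024 bytes is an assumption.

```python
from dataclasses import dataclass

# Assumption: the 100MB cap from the ticket, read as 100 * 1024 * 1024 bytes.
MEMORY_LIMIT_BYTES = 100 * 1024 * 1024


class MemoryLimitExceeded(Exception):
    """Raised by the hypothetical tracker when an addition would pass the cap."""


@dataclass
class MemoryTracker:
    """Hypothetical helper that sums per-item size estimates against a fixed cap."""

    limit_bytes: int = MEMORY_LIMIT_BYTES
    used_bytes: int = 0

    def add(self, estimated_bytes: int) -> None:
        # Check before committing, so used_bytes never exceeds limit_bytes.
        if self.used_bytes + estimated_bytes > self.limit_bytes:
            raise MemoryLimitExceeded(
                f"estimated usage would reach {self.used_bytes + estimated_bytes} bytes, "
                f"limit is {self.limit_bytes} bytes"
            )
        self.used_bytes += estimated_bytes
```

A stage would call `add()` with an estimate for each item it buffers and either fail or spill to disk when the exception is raised. Expressions and accumulators currently have no equivalent accounting.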

For example, the pipeline:

[{$group: {_id: null, all: {$push: "$$ROOT"}}}]

will push every input document into one giant array. This does fail with an error if spilling to disk is not enabled, because $group approximates the size of all accumulators. But once we spill to disk and begin merging the spilled data back together, memory usage is no longer tracked. Because $push stores just a vector<Value>, the vector itself does not grow very large (each Value is just 16 bytes). However, each of those Values refers to a sub-tree of memory, and in the worst case those sub-trees add up to a very large amount of memory.
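The gap between the size of the vector and the memory it refers to can be sketched with a toy model. This is only an illustration: `Handle`, `shallow_size`, and `deep_size` are hypothetical names, not server code, and the only figure taken from the ticket is the 16 bytes per Value.

```python
from dataclasses import dataclass
from typing import List

HANDLE_BYTES = 16  # per-Value size stated in the ticket


@dataclass
class Handle:
    """Hypothetical stand-in for a Value: a fixed-size handle to a memory sub-tree."""

    referenced_bytes: int  # size of the sub-tree this handle points at


def shallow_size(values: List[Handle]) -> int:
    """Bytes used by the handles alone, ignoring what they reference."""
    return HANDLE_BYTES * len(values)


def deep_size(values: List[Handle]) -> int:
    """Bytes used by the handles plus everything they reference."""
    return shallow_size(values) + sum(v.referenced_bytes for v in values)


# Illustrative numbers only: 1000 pushed documents of 1 KiB each.
pushed = [Handle(referenced_bytes=1024) for _ in range(1000)]
print(shallow_size(pushed))  # 16000
print(deep_size(pushed))     # 1040000
```

An accounting scheme that only looks at the handles would report 16,000 bytes here, while the referenced data is 65 times larger, which is why any tracking has to account for the sub-trees as well.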

SERVER-44174 tracks the work to fix $push and $addToSet, while this ticket tracks the work to create a more general solution for all expressions and accumulators.

