Core Server / SERVER-43944

Accumulators (and possibly expressions) can allocate large amounts of memory

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Aggregation Framework
    • Labels:
    • Query Execution
    • ALL

      Generally, every stage that builds a large data structure (for example $sort or $group) uses estimated sizes to cap its memory usage at 100MB. No such cap exists for expressions or accumulators, yet an accumulator like $push can end up allocating a large amount of memory.

      For example, the pipeline:

      [{$group: {_id: null, all: {$push: "$$ROOT"}}}]

      will push every document into one giant array. If spilling to disk is not enabled, this pipeline does fail with an error, since $group approximates the size of all of its accumulators. But once $group spills to disk and then begins merging the spilled data back together, memory usage is no longer tracked. Because $push tracks just a vector<Value>, the array itself does not grow very large (each Value is just 16 bytes). However, each of those Values refers to a sub-tree of memory, and in the worst case those sub-trees add up to a very large amount of memory.
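
      To illustrate the gap between the size of the vector and the memory it actually pins, here is a minimal C++ sketch. It is not the server's code: `Value`, `approximateSize`, and `TrackedPush` are hypothetical stand-ins, and the limit check is just one assumed way an accumulator could account for the deep size of what it holds.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the server's Value: a small handle that may
// refer to an arbitrarily large heap-allocated payload.
struct Value {
    std::shared_ptr<std::string> payload;

    // Approximate deep size: the handle itself plus the payload it refers to.
    // Payloads shared between several Values are counted once per Value, so
    // this is an upper-bound style estimate, not an exact figure.
    std::size_t approximateSize() const {
        return sizeof(Value) + (payload ? payload->size() : 0);
    }
};

// Hypothetical $push-like accumulator that tracks deep memory usage and
// refuses to grow past a configured limit.
class TrackedPush {
public:
    explicit TrackedPush(std::size_t limitBytes) : _limitBytes(limitBytes) {
        assert(limitBytes > 0);
    }

    // Adds a value, or throws if doing so would exceed the limit.
    void add(Value v) {
        const std::size_t added = v.approximateSize();
        if (_memUsageBytes + added > _limitBytes) {
            throw std::length_error("$push exceeded its memory limit");
        }
        _memUsageBytes += added;
        _values.push_back(std::move(v));
    }

    std::size_t size() const { return _values.size(); }

    // What is counted when only the elements of the vector are considered.
    std::size_t shallowBytes() const { return _values.size() * sizeof(Value); }

    // What is counted when each Value's sub-tree is included.
    std::size_t memUsageBytes() const { return _memUsageBytes; }

private:
    std::size_t _limitBytes;
    std::size_t _memUsageBytes = 0;
    std::vector<Value> _values;
};
```

      In this sketch, pushing a Value that refers to a 1000-byte payload raises `shallowBytes()` by only `sizeof(Value)`, while `memUsageBytes()` rises by `sizeof(Value) + 1000`. An accumulator that only watched the shallow figure would never notice the payloads.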

      SERVER-44174 tracks the work to fix $push and $addToSet, while this ticket tracks the work to create a more general solution for all expressions and accumulators.

            backlog-query-execution [DO NOT USE] Backlog - Query Execution
            charlie.swanson@mongodb.com Charlie Swanson