[SERVER-49506] Excessive memory being used in pipelines with deeply nested documents Created: 14/Jul/20 Updated: 06/Oct/20 Resolved: 06/Oct/20 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 4.2.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kevin Arhelger | Assignee: | Ian Boros |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | qexec-team |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | memory allocation flamegraphs (see Description) |
| Issue Links: | |
| Operating System: | ALL |
| Sprint: | Query 2020-10-05, Query 2020-10-19 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
Attached are memory allocation flamegraphs showing a pair of aggregations over deeply nested documents with $redact using 35GB of memory, or 18.5GB each. I do not have the exact aggregation or example document(s) to reproduce this issue, but I believe any pipeline running $redact over deeply nested documents should show similar behavior. |
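Since the exact aggregation is not available, the following is a minimal, hypothetical sketch in the mongo shell of the kind of pipeline that might exhibit this. The collection name, nesting depth, and the acl field are all illustrative assumptions, not details from the original case:

```javascript
// Hypothetical repro sketch: build one document nested ~50 levels deep.
// Collection name ("deep"), depth, and the "acl" field are assumptions.
let doc = { level: 0, acl: "ok" };
for (let i = 1; i <= 50; i++) {
  doc = { level: i, acl: "ok", child: doc };
}
db.deep.insertOne(doc);

db.deep.aggregate([
  // $redact evaluates its expression at every level of every document:
  // $$DESCEND keeps the current level and recurses into subdocuments,
  // $$PRUNE drops the subtree. The walk touches every nested field.
  { $redact: { $cond: [{ $eq: ["$acl", "ok"] }, "$$DESCEND", "$$PRUNE"] } },
  // A blocking stage then has to buffer every rewritten document.
  { $sort: { level: -1 } }
]);
```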
| Comments |
| Comment by Ian Boros [ 06/Oct/20 ] |
After discussion with kevin.arhelger, we've agreed that this is likely a duplicate of an existing issue. |
| Comment by Ian Boros [ 06/Oct/20 ] |
Correction to my last comment: the document caching only exists in 4.4 and later, so I would not expect the $redact to actually blow up memory usage in the way I described. The memory usage bug with $facet, mentioned in my previous comment, is the most likely explanation here. I was unable to access the logs in the support case, as it looks like they've been deleted (attempts to access them result in "The specified key does not exist"). |
| Comment by Ian Boros [ 30/Sep/20 ] |
My guess is that the issue here is the combination of $redact, a blocking stage (both $sort and $group appear after the $redact in the flamegraph), and $facet.

For some context: $redact walks the input document, which brings each field into the Document's cache. It then builds a new document using MutableDocument, which is fully cached. We know from prior experience that a fully cached Document uses far more memory than the plain BSON object it represents, because of the overhead involved in maintaining the Document's structure (pointers to children, the document's hash table, etc.); a fully cached Document can be 3-4x the size of the underlying BSON. Feeding many fully cached documents into a blocking stage could therefore account for a lot of memory usage.

I also see DocumentSourceFacet ($facet) in the flame graph. For a while, $facet did not enforce any limit on its total memory usage. My understanding is that while there were limits on the size of an individual document, $facet would execute each of its sub-pipelines to completion, so as long as each document stayed under the size threshold, total memory usage could grow without bound. A fix for this was merged.

I'm returning this to "needs scheduling" for discussion. |
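To make that shape concrete, a hypothetical pipeline of the kind described above might look as follows, illustrating the pre-fix $facet behavior. The stage contents and field names are assumptions, not the customer's actual pipeline:

```javascript
// Hypothetical shape only: stage contents and field names are assumptions.
db.deep.aggregate([
  // $redact re-materializes each document as a fully cached Document.
  { $redact: { $cond: [{ $eq: ["$acl", "ok"] }, "$$DESCEND", "$$PRUNE"] } },
  // Each $facet sub-pipeline runs to completion, and its entire result set
  // is buffered into one array field of a single output document, so the
  // per-document size limit never bounds the total memory used.
  { $facet: {
      sorted: [{ $sort: { level: -1 } }],
      counts: [{ $group: { _id: "$level", n: { $sum: 1 } } }]
  } }
]);
```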