[SERVER-53760] $unwind + $sort pipeline produces large number of file handles when spilling to disk Created: 13/Jan/21 Updated: 29/Oct/23 Resolved: 11/May/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 4.4.3 |
| Fix Version/s: | 4.4.7, 5.0.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Mihai Andrei | Assignee: | Mohammad Dashti (Inactive) |
| Resolution: | Fixed | Votes: | 6 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: | v4.4 | ||||||||
| Sprint: | Query 2021-01-25, Query Execution 2021-04-19, Query Execution 2021-05-03, Query Execution 2021-05-17 | ||||||||
| Participants: | |||||||||
| Case: | (copied to CRM) | ||||||||
| Description |
|
Suppose we have an aggregation which features a $unwind followed by a $sort that spills to disk. $unwind does not produce owned documents; rather, it updates the field being unwound in memory while preserving the backing BSON. The sorter relies on 'memUsageForSorter()' (which simply calls 'Document::getApproximateSize()') to compute the current memory usage and determine whether or not it should spill. Crucially, 'getApproximateSize()' includes the size of the backing BSON (the document that is input to $unwind), which is massive compared to each document produced by $unwind. As a result, the $sort that follows reaches its memory limit quickly and produces an enormous number of file handles when spilling to disk compared to 4.2. |
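The effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's sorter code: the function names, the byte sizes, and the "one spill file per memory-limit overflow" rule are all hypothetical and chosen purely to show how charging each unwound document for the whole backing BSON inflates the spill count.

```python
# Toy model of sorter memory accounting. All names and numbers here are
# hypothetical; this is not MongoDB's implementation.

def unwind_size_estimates(array_len, element_size, overhead, include_backing):
    """Return the size the sorter would be told for each unwound document.

    include_backing=True mimics an estimate that also counts the full input
    document (the backing BSON) for every document produced by the unwind.
    """
    backing = overhead + array_len * element_size  # size of the input document
    own = overhead + element_size                  # size of one unwound document
    per_doc = own + backing if include_backing else own
    return [per_doc] * array_len


def count_spills(doc_sizes, mem_limit):
    """Count spills, assuming one spill file each time the limit is exceeded."""
    spills = 0
    used = 0
    for size in doc_sizes:
        used += size
        if used > mem_limit:
            spills += 1  # sorter writes its buffer to a new file
            used = 0     # and starts accumulating again
    return spills


if __name__ == "__main__":
    limit = 100_000  # hypothetical sorter memory limit in bytes
    accurate = unwind_size_estimates(1000, 100, 50, include_backing=False)
    inflated = unwind_size_estimates(1000, 100, 50, include_backing=True)
    print(count_spills(accurate, limit))  # 1
    print(count_spills(inflated, limit))  # 1000
```

In this model a single 1000-element array yields one spill when each unwound document is charged only for its own bytes, but one spill per document when the backing BSON is counted each time, which mirrors the large number of file handles reported here.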
| Comments |
| Comment by Githook User [ 20/May/21 ] |
|
Author: {'name': 'Mohammad Dashti', 'email': 'mdashti@gmail.com', 'username': 'mdashti'} Message: |
| Comment by Githook User [ 20/May/21 ] |
|
Author: {'name': 'Mohammad Dashti', 'email': 'mdashti@gmail.com', 'username': 'mdashti'} Message: Co-authored-by: Mihai Andrei <mihai.andrei@10gen.com> (cherry picked from commit ef3c46f76af3b8e2ca92cc9f071885d5b49998fb) |
| Comment by Githook User [ 10/May/21 ] |
|
Author: {'name': 'Mohammad Dashti', 'email': 'mdashti@gmail.com', 'username': 'mdashti'} Message: Co-authored-by: Mihai Andrei <mihai.andrei@10gen.com> |