[SERVER-53760] $unwind + $sort pipeline produces large number of file handles when spilling to disk
Created: 13/Jan/21  Updated: 29/Oct/23  Resolved: 11/May/21

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 4.4.3
Fix Version/s: 4.4.7, 5.0.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Mihai Andrei
Assignee: Mohammad Dashti (Inactive)
Resolution: Fixed
Votes: 6
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.4
Sprint: Query 2021-01-25, Query Execution 2021-04-19, Query Execution 2021-05-03, Query Execution 2021-05-17

 Description   

Suppose we have an aggregation that features a $unwind followed by a $sort that spills to disk. $unwind does not produce owned documents; instead, it updates the field being unwound in memory while preserving the backing BSON. The sorter relies on 'memUsageForSorter()' (which simply calls 'Document::getApproximateSize()') to compute its current memory usage and decide whether to spill. Crucially, 'getApproximateSize()' includes the size of the backing BSON (the document that is input to $unwind), which is far larger than each document produced by $unwind. As a result, the $sort that follows reaches its memory limit quickly and creates an enormous number of file handles when spilling to disk, compared to 4.2.
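
To make the effect concrete, here is a toy model of the accounting problem, not the server's sorter code. It assumes a hypothetical 1 MB input document whose array unwinds into 10,000 small documents of roughly 200 bytes each, and a 100 MB sort memory budget. The function name `count_spills` and all sizes are illustrative assumptions.

```python
def count_spills(size_estimates, mem_limit):
    """Count spills for a sorter that flushes its buffer to a new
    file whenever the running size estimate exceeds mem_limit."""
    used = 0
    spills = 0
    for size in size_estimates:
        used += size
        if used > mem_limit:
            spills += 1  # buffered run is written out to a new spill file
            used = 0
    return spills


# All values below are hypothetical and chosen only for illustration.
BACKING_BSON_SIZE = 1_000_000   # size of the document fed into $unwind
UNWOUND_DOC_SIZE = 200          # size of one unwound document on its own
UNWOUND_COUNT = 10_000          # number of documents $unwind emits
MEM_LIMIT = 100 * 1024 * 1024   # assumed 100 MB sort memory budget

# Estimate that charges the shared backing BSON to every unwound document.
overcounted = count_spills(
    [UNWOUND_DOC_SIZE + BACKING_BSON_SIZE] * UNWOUND_COUNT, MEM_LIMIT
)

# Estimate that charges only the data each unwound document actually needs.
accurate = count_spills([UNWOUND_DOC_SIZE] * UNWOUND_COUNT, MEM_LIMIT)

print(overcounted, accurate)
```

Under these assumptions the over-counting estimate forces a spill roughly every hundred documents, while the accurate estimate fits the whole input (about 2 MB) in memory and never spills. Since each spill in this model corresponds to a new file, the gap mirrors the file-handle growth described above.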



 Comments   
Comment by Githook User [ 20/May/21 ]

Author: Mohammad Dashti <mdashti@gmail.com> (username: mdashti)

Message: SERVER-53760 Fixed the `sort_spill_estimate_data_size.js` test to become more deterministic
Branch: master
https://github.com/mongodb/mongo/commit/dfaa8ca5b078c2bf03c75462846d5155cade4e2c

Comment by Githook User [ 20/May/21 ]

Author: Mohammad Dashti <mdashti@gmail.com> (username: mdashti)

Message: SERVER-53760 Improved document size approximation for spilling to disk

Co-authored-by: Mihai Andrei <mihai.andrei@10gen.com>

(cherry picked from commit ef3c46f76af3b8e2ca92cc9f071885d5b49998fb)
Branch: v4.4
https://github.com/mongodb/mongo/commit/d0a98dd414ab112d4a933b727fc6eeddc1394c91

Comment by Githook User [ 10/May/21 ]

Author: Mohammad Dashti <mdashti@gmail.com> (username: mdashti)

Message: SERVER-53760 Improved document size approximation for spilling to disk

Co-authored-by: Mihai Andrei <mihai.andrei@10gen.com>
Branch: master
https://github.com/mongodb/mongo/commit/ef3c46f76af3b8e2ca92cc9f071885d5b49998fb
