[SERVER-56502] [sbe][query_fuzzer_standalone] Sort triggers memory limit exceeded on only one version Created: 29/Apr/21  Updated: 29/Oct/23  Resolved: 05/May/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: Kyle Suarez Assignee: Martin Neupauer
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-52799 Make sbe the default execution engine... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query Execution 2021-05-03, Query Execution 2021-05-17
Participants:

 Description   

From this task:

Unexpected failure for command: {
        "find" : "fuzzer_coll",
        "sort" : { 
                "array" : 1,
                "obj.str" : 1,
                "obj.obj.obj.obj.obj.num" : -1, 
                "_id" : 1 
        },  
        "limit" : NumberLong(18),
        "projection" : { 
                "sortKey" : { 
                        "$meta" : "sortKey"
                }   
        }   
}
uncaught exception:
 
Error: Query failed.
Version 1 returned result set: [...] // omitted for space
Version 2 returned error: [Error: error: {
        "ok" : 0,
        "errmsg" : "PlanExecutor error during aggregation :: caused by :: Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. Pass allowDiskUse:true to opt in.",
        "code" : 292,
        "codeName" : "QueryExceededMemoryLimitNoDiskUseAllowed"
}] :
assertQueryFuzzerErrorIsAcceptable@jstestfuzz/out/query_fuzzer-2eaf33-1619674604379-0.js:2167:15
validateQueryResultsAndSort/<@jstestfuzz/out/query_fuzzer-2eaf33-1619674604379-0.js:2284:88
validateQueryResults@jstestfuzz/out/query_fuzzer-2eaf33-1619674604379-0.js:528:13
validateQueryResultsAndSort@jstestfuzz/out/query_fuzzer-2eaf33-1619674604379-0.js:2284:9
_loop_1@jstestfuzz/out/query_fuzzer-2eaf33-1619674604379-0.js:2339:9
runFindAndAggregate@jstestfuzz/out/query_fuzzer-2eaf33-1619674604379-0.js:2342:9
main@jstestfuzz/out/query_fuzzer-2eaf33-1619674604379-0.js:2365:17
@jstestfuzz/out/query_fuzzer-2eaf33-1619674604379-0.js:2371:1
@jstestfuzz/out/query_fuzzer-2eaf33-1619674604379-0.js:1:2
failed to load: jstestfuzz/out/query_fuzzer-2eaf33-1619674604379-0.js
exiting with code -3



 Comments   
Comment by Githook User [ 05/May/21 ]

Author:

{'name': 'Martin Neupauer', 'email': 'xmaton@messengeruser.com'}

Message: SERVER-56502 Sort triggers memory limit exceeded on only one version

The TopK sorter keeps track of how much memory it uses to hold the heap.
As it adds and removes rows from the heap, it adds or subtracts the
size of each element from the total memory usage. Unfortunately,
KeyString::Value::memUsageForSorter is unstable: it can report
different values at different times, which corrupts the total
memory usage (it caused underflow).
The fix is to use a stable alternative, KeyString::Value::getSize.
Branch: master
https://github.com/mongodb/mongo/commit/ebffbebedc594fae13719a784f4b5ca409da0b0b
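
For illustration, the accounting invariant described in the commit message can be sketched as follows. This is a minimal sketch, not the actual patch: TopKTracker and stableSize are hypothetical names, and std::string keys stand in for the sorter's real key type. The point it shows is that the size charged when a row enters the heap must equal the size refunded when that row leaves.

```cpp
#include <cstddef>
#include <cstdint>
#include <queue>
#include <string>
#include <vector>

// Hypothetical top-K tracker that keeps the K smallest keys.
// Names and types are illustrative; this is not MongoDB's sorter.
class TopKTracker {
public:
    explicit TopKTracker(std::size_t k) : _k(k) {}

    void add(const std::string& key) {
        if (_k == 0) return;
        if (_heap.size() == _k) {
            // Heap is full: keep the new key only if it beats the current worst.
            if (!(key < _heap.top())) return;
            // Refund exactly what was charged when the evicted key was added.
            _memUsage -= stableSize(_heap.top());
            _heap.pop();
        }
        _memUsage += stableSize(key);
        _heap.push(key);
    }

    std::uint64_t memUsage() const { return _memUsage; }
    std::size_t size() const { return _heap.size(); }

private:
    // Depends only on the key's contents, so it returns the same value
    // at insertion and at eviction (assumed analogue of a stable getSize).
    static std::uint64_t stableSize(const std::string& key) { return key.size(); }

    std::size_t _k;
    std::uint64_t _memUsage = 0;
    std::priority_queue<std::string> _heap;  // max-heap: top() is the largest retained key
};
```

With a stable size function, memUsage() always equals the sum of the sizes of the keys currently retained, so the unsigned total cannot drop below zero.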

Comment by Justin Seyster [ 03/May/21 ]

The cause of this ticket is that, when TopKSorter is specialized with <MaterializedRow, MaterializedRow>, the memUsageForSorter() value is sometimes different between when an item gets added to the heap and when it gets removed. As a result, the _memUsage value can go negative, causing it to overflow to UNSIGNED_LONG_LONG_MAX, so that it looks like the sort has run out of memory (when it actually has plenty of memory). The only thing I haven't figured out yet is what causes the MaterializedRows to appear to change in size. (They should be immutable.)
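
The failure mode described above can be reproduced in isolation. The following is a minimal sketch under stated assumptions: MemTracker and simulateUnstableSize are hypothetical names, the byte counts 48 and 64 are made up, and only the 104857600-byte limit comes from the error message in the description. It shows how subtracting a larger size than was added makes an unsigned counter wrap around to a huge value, which then trips the limit check.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the sorter's running memory total; not MongoDB code.
struct MemTracker {
    std::uint64_t memUsage = 0;
    std::uint64_t limit = 104857600;  // limit quoted in the error message

    void add(std::uint64_t bytes) { memUsage += bytes; }
    // Unsigned subtraction wraps modulo 2^64 when bytes > memUsage.
    void remove(std::uint64_t bytes) { memUsage -= bytes; }
    bool overLimit() const { return memUsage > limit; }
};

// Simulates one row whose reported size differs between insertion and removal.
// The sizes are invented for the example.
inline MemTracker simulateUnstableSize() {
    MemTracker t;
    t.add(48);     // size reported when the row enters the heap
    t.remove(64);  // larger size reported when the same row leaves
    return t;
}
```

After simulateUnstableSize(), the tracker holds no rows at all, yet overLimit() reports true because memUsage has wrapped to 2^64 - 16.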
