[SERVER-65285] Gracefully handle empty group-by key when spilling in HashAgg Created: 06/Apr/22  Updated: 29/Oct/23  Resolved: 13/Apr/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.3.2

Type: Bug Priority: Major - P3
Reporter: Eric Cox (Inactive) Assignee: Eric Cox (Inactive)
Resolution: Fixed Votes: 0
Labels: sbe
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.3
Steps To Reproduce:

const insertColl = db.getDB("test").foo;
// readColl is not defined in the original snippet; assumed here to be the same collection.
const readColl = insertColl;
for (let i = 0; i < 500; ++i) {
    assert.commandWorked(insertColl.insert({a: i, string: "test test test"}));
}

// Lower both thresholds to 1 byte so that HashAgg and HashLookup spill almost immediately.
assert.commandWorked(db.adminCommand(
    {setParameter: 1, internalQuerySlotBasedExecutionHashAggApproxMemoryUseInBytesBeforeSpill: 1}));
assert.commandWorked(db.adminCommand(
    {setParameter: 1, internalQuerySlotBasedExecutionHashLookupApproxMemoryUseInBytesBeforeSpill: 1}));

const pipeline = [{$lookup: {from: readColl.getName(), localField: "a", foreignField: "a", as: "results"}}];

let res = readColl.aggregate(pipeline, {allowDiskUse: true}).toArray();

Invariant tripped:

{"t":{"$date":"2022-04-05T22:20:32.074+00:00"},"s":"F", "c":"ASSERT", "id":23081, "ctx":"conn1","msg":"Invariant failure","attr":{"expr":"size > 0","msg":"key size must be greater than 0","file":"src/mongo/db/record_id.h","line":92}}

Sprint: QE 2022-04-18
Participants:

 Description   

I noticed that when spilling to disk in $lookup plans that use HashAgg, we can get into a situation where the HashAgg stage holds only a single row with an empty group-by key. If this row spills, we can trip an invariant in the server.

I fixed this when merging spilling in HashLookupStage and added a C++ test for this exact scenario to the HashAgg unit tests.

We should use this ticket to backport the fix applied in SERVER-62739, which is to check whether _inKeyAccessor.size() == 0 and, if so, bail out of spilling.

https://github.com/10gen/mongo/pull/4099/files#diff-e1665f068faa10a7e7304141f4f42562dc0a2c1691e77d434ec24c106b3baa66R264

This is okay because in that case the hash table holds only a single row, so there is no point in spilling to disk: we would have to spill the value and then reload it into memory whenever we update it, by pointing the switch accessor at the value and running the bytecode.
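
The guard described above can be illustrated with a small standalone sketch. This is a toy model in plain JavaScript, not the server's C++ code: the class ToyHashAgg, its methods, and the use of a group count as the memory limit are all hypothetical and exist only to show why an empty group-by key should skip spilling rather than reach a store that rejects empty keys.

```javascript
// Toy model only: every name here is hypothetical and unrelated to the server's classes.
class ToyHashAgg {
    constructor(keyFields, memoryLimit) {
        this.keyFields = keyFields;      // group-by fields; may be empty
        this.memoryLimit = memoryLimit;  // max in-memory groups before spilling
        this.table = new Map();          // in-memory groups: encoded key -> count
        this.spilled = new Map();        // stand-in for the on-disk record store
    }

    // Encode the group-by values as a string; an empty field list yields "".
    encodeKey(doc) {
        return this.keyFields.map((f) => JSON.stringify(doc[f] ?? null)).join("|");
    }

    // Models a store that rejects empty keys, like the invariant in the ticket.
    spillInsert(key, value) {
        if (key.length === 0) {
            throw new Error("key size must be greater than 0");
        }
        this.spilled.set(key, (this.spilled.get(key) ?? 0) + value);
    }

    add(doc) {
        const key = this.encodeKey(doc);
        this.table.set(key, (this.table.get(key) ?? 0) + 1);
        this.maybeSpill();
    }

    maybeSpill() {
        // Guard: with no group-by fields there is exactly one group, so
        // spilling gains nothing and would hit the empty-key check.
        if (this.keyFields.length === 0) {
            return;
        }
        if (this.table.size <= this.memoryLimit) {
            return;
        }
        for (const [key, value] of this.table) {
            this.spillInsert(key, value);
        }
        this.table.clear();
    }

    // Merge spilled and in-memory groups into the final counts.
    results() {
        const merged = new Map(this.spilled);
        for (const [key, value] of this.table) {
            merged.set(key, (merged.get(key) ?? 0) + value);
        }
        return merged;
    }
}
```

In this sketch, removing the first check in maybeSpill makes an aggregation with an empty keyFields array and a memory limit of 0 throw on the first document, which mirrors the failure mode reported here.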



 Comments   
Comment by Githook User [ 12/Apr/22 ]

Author:

{'name': 'Eric Cox', 'email': 'eric.cox@mongodb.com', 'username': 'ericox'}

Message: SERVER-65285 Gracefully handle empty group-by key when spilling in HashAgg
Branch: v5.3
https://github.com/mongodb/mongo/commit/c0036901e114de3411f4b3d6a1da8f2c0196c6d8
