[SERVER-59834] $group with allowDiskUse doesn't clean up _tmp files Created: 08/Sep/21  Updated: 29/Oct/23  Resolved: 13/Jul/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 5.0.0, 5.0.2, 6.1.0-rc0
Fix Version/s: 5.0.12, 6.1.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Bruce Lucas (Inactive)
Assignee: Alberto Massari
Resolution: Fixed
Votes: 1
Labels: query-director-triage
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0
Sprint: QE 2022-05-16, QE 2022-05-30, QE 2022-06-13, QE 2022-06-27, QE 2022-07-11, QE 2022-07-25
Participants:
Linked BF Score: 60

 Description   

UPDATE: There is now a dedicated test on the master branch, group_tmp_file_cleanup.js; however, it only starts to fail with memoryLimitMb >= 15.

This seems to be a fairly general issue, because the following simple test reproduces the problem:

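function repro() {
    // Start from an empty collection.
    db.c.drop();

    print("inserting");
    // 100 batches of 1000 documents, each carrying a 1000-character string.
    const doc = {x: "x".repeat(1000)};
    const many = Array(1000).fill(doc);
    for (let i = 0; i < 100; i++) {
        db.c.insertMany(many);
    }

    print("aggregating");
    // Grouping on _id with $push keeps every document's payload in the
    // accumulator state, which is intended to make $group spill to disk.
    const r = db.c.aggregate([
        {
            $group: {
                "_id": "$_id",
                x: {$push: "$x"},
            }
        }
    ], {
        allowDiskUse: true
    });
}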
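
To see whether the repro left anything behind, one option is to list the `_tmp` directory under the server's dbpath once the aggregation has completed. The following is a minimal Node.js sketch, not the server's test: `listLeftoverTmpFiles` is a hypothetical helper, and it assumes that spill files are written under `<dbpath>/_tmp` and that the script runs on the same host as mongod.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: returns the sorted names of entries in <dbpath>/_tmp.
// Assumption: mongod writes its spill files there. A missing directory is
// treated as "nothing left over".
function listLeftoverTmpFiles(dbpath) {
    const tmpDir = path.join(dbpath, "_tmp");
    if (!fs.existsSync(tmpDir)) {
        return [];
    }
    return fs.readdirSync(tmpDir).sort();
}
```

Pass the dbpath of the mongod under test. A non-empty result after the pipeline has completed would indicate leftover files, while a check made while the pipeline is still running may report files that are legitimately in use.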



 Comments   
Comment by Githook User [ 19/Aug/22 ]

Author:

{'name': 'Alberto Massari', 'email': 'alberto.massari@mongodb.com', 'username': 'albymassari'}

Message: SERVER-59834 Backport test to v5.0
Branch: v5.0
https://github.com/mongodb/mongo/commit/0bc559a38b06ca9b408481c33e39d3187aa636be

Comment by Githook User [ 13/Jul/22 ]

Author:

{'name': 'Alberto Massari', 'email': 'alberto.massari@mongodb.com', 'username': 'albymassari'}

Message: SERVER-59834: ensure pipeline has completed before checking for leftover files
Branch: master
https://github.com/mongodb/mongo/commit/168088aca1c60bd984d279a35336f6f036ed7c62

Comment by Romans Kasperovics [ 29/Jun/22 ]

arun.banala@mongodb.com the problem still exists in master (and probably also on 5.0, but I don't remember if I checked it).

Comment by Kyle Suarez [ 02/Jun/22 ]

I reopened the backport request and sent it to the Query Execution triage queue. romans.kasperovics@mongodb.com, I assume it cherry-picks cleanly because it's a new file? If the backport passes cleanly in v5.0, I think it's worth backporting as Kelsey and Bruce have suggested.

Comment by Bruce Lucas (Inactive) [ 02/Jun/22 ]

Personally I think this is worth backporting.

Comment by Githook User [ 24/May/22 ]

Author:

{'name': 'romanskas', 'email': '30618745+romanskas@users.noreply.github.com', 'username': 'romanskas'}

Message: SERVER-59834 Add test for _tmp files cleanup for $group
Branch: master
https://github.com/mongodb/mongo/commit/218f4a72dcb54bb6f6b88354cda9296bce9fe606

Comment by Steve La (Inactive) [ 23/Nov/21 ]

Remaining work is to confirm this is fixed and to add a test if one is missing.

Comment by Bruce Lucas (Inactive) [ 09/Sep/21 ]

Great, thanks!

Comment by Gregory Noma [ 09/Sep/21 ]

bruce.lucas I believe this was because DocumentSourceGroup previously relinquished file deletion responsibilities to the MergeIterator. Before 5.0, this behavior was correct. However, the ownership of file deletion changed at some point, seemingly between 4.4 and 5.0, and since we didn't have any testing for this, the issue went unnoticed. SERVER-54791 fixed this by consolidating file deletion responsibilities in the new Sorter::File class. Even if it is fixed now, I agree that it would be prudent to add a test.
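
The idea of consolidating deletion in a single owner can be illustrated with a small Node.js sketch. This is only an analogy under stated assumptions: `OwnedSpillFile` and `readAll` are hypothetical names, and the code does not reflect the actual C++ Sorter::File implementation.

```javascript
const fs = require("fs");
const path = require("path");

// Illustrative only: one object creates the spill file and is the only
// place that deletes it.
class OwnedSpillFile {
    constructor(dir, name) {
        this.filePath = path.join(dir, name);
        this.removed = false;
        fs.writeFileSync(this.filePath, "");
    }

    append(data) {
        fs.appendFileSync(this.filePath, data);
    }

    // Idempotent, so every code path that finishes with the file can call it.
    remove() {
        if (!this.removed) {
            fs.rmSync(this.filePath, {force: true});
            this.removed = true;
        }
    }
}

// Readers borrow the file and never delete it themselves.
function readAll(spillFile) {
    return fs.readFileSync(spillFile.filePath, "utf8");
}
```

With a single owner, handing the file to another component for reading does not also hand over the duty to delete it, which is the kind of ambiguity the comment above describes.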

Comment by Bruce Lucas (Inactive) [ 09/Sep/21 ]

Also, should we add a test that _tmp files are properly cleaned up?

Comment by Bruce Lucas (Inactive) [ 09/Sep/21 ]

gregory.noma, that would be good news. Since that ticket wasn't particularly aimed at fixing this issue, it would be good to identify which particular aspect of that ticket fixed this issue.

Comment by Gregory Noma [ 08/Sep/21 ]

This may have been fixed in SERVER-54791.

Generated at Thu Feb 08 05:48:14 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.