[SERVER-38179] range deleter must be prepared for document to be deleted from under it Created: 16/Nov/18  Updated: 29/Oct/23  Resolved: 11/Jan/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.1.7

Type: Bug Priority: Major - P3
Reporter: Charlie Swanson Assignee: Randolph Tan
Resolution: Fixed Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2018-12-17, Sharding 2018-12-31, Sharding 2019-01-14
Participants:
Linked BF Score: 45

 Description   

The range deleter attempts to find documents which have since been migrated away from a shard and delete them. When performing the deletion, it correctly wraps the call to Collection::deleteDocument inside a writeConflictRetry:

collection_range_deleter.cpp

        exec->saveState();
 
        writeConflictRetry(opCtx, "delete range", nss.ns(), [&] {
            WriteUnitOfWork wuow(opCtx);
            if (saver) {
                uassertStatusOK(saver->goingToDelete(obj));
            }
            collection->deleteDocument(opCtx, kUninitializedStmtId, rloc, nullptr, true);
            wuow.commit();
        });
 
        try {
            exec->restoreState();

The call to collection->deleteDocument() will end up looking up the document and asserting that the document exists. This should generally be true, but if we encounter a write conflict exception, the writeConflictRetry loop will abandon the snapshot in between attempts:

write_conflict_exception.h

    int attempts = 0;
    while (true) {
        try {
            return f();
        } catch (WriteConflictException const&) {
            CurOp::get(opCtx)->debug().additiveMetrics.incrementWriteConflicts(1);
            WriteConflictException::logAndBackoff(attempts, opStr, ns);
            ++attempts;
            opCtx->recoveryUnit()->abandonSnapshot();
        }
    }

Once the snapshot has been abandoned, we need to be able to handle the document no longer existing on the next attempt.
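To make the failure mode concrete, here is a minimal, self-contained sketch of the sequence described above. It is a toy model, not server code: ToyStore, ToyWriteConflict, ToyDocumentMissing, toyWriteConflictRetry and retryHitsMissingDocument are all hypothetical names, and the "snapshot" is just a copy of a set of record ids.

```cpp
#include <cassert>
#include <exception>
#include <set>

// Hypothetical stand-ins for illustration; none of these are MongoDB types.
struct ToyWriteConflict : std::exception {};
struct ToyDocumentMissing : std::exception {};

struct ToyStore {
    std::set<int> committed;  // record ids a fresh snapshot would see
    std::set<int> snapshot;   // record ids the current snapshot sees

    // After abandoning the snapshot, the next read sees the latest data.
    void abandonSnapshot() { snapshot = committed; }

    // Models a delete that insists the document exists.
    void deleteDocument(int rid) {
        if (snapshot.count(rid) == 0) throw ToyDocumentMissing();
        committed.erase(rid);
        snapshot.erase(rid);
    }
};

// Same shape as the retry loop quoted above, minus metrics and backoff.
template <typename F>
void toyWriteConflictRetry(ToyStore& store, F&& f) {
    int attempts = 0;
    while (true) {
        try {
            f();
            return;
        } catch (const ToyWriteConflict&) {
            ++attempts;
            assert(attempts < 10);  // toy guard: this sketch should never spin
            store.abandonSnapshot();
        }
    }
}

// Returns true if the retried attempt finds the document already gone.
bool retryHitsMissingDocument() {
    ToyStore store;
    store.committed = {7};
    store.snapshot = {7};
    bool firstAttempt = true;
    try {
        toyWriteConflictRetry(store, [&] {
            if (firstAttempt) {
                firstAttempt = false;
                store.committed.erase(7);  // a concurrent writer removes the document
                throw ToyWriteConflict();
            }
            store.deleteDocument(7);  // second attempt runs against the new snapshot
        });
    } catch (const ToyDocumentMissing&) {
        return true;
    }
    return false;
}
```

In this model the first attempt conflicts with a concurrent removal, the loop abandons the snapshot, and the second attempt calls deleteDocument on a record id that is no longer visible.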

So I would propose that the code in collection_range_deleter.cpp either (1) change the PlanExecutor it constructs to include a delete stage, which is able to handle the document no longer existing, or (2) manually check before each delete whether the document still exists.
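As a rough illustration of option (2), the sketch below re-checks for the document inside the retried body, so a retry after a conflict simply skips a document that has disappeared. This is a toy under stated assumptions, not the eventual fix: ToyCollection, toyRetry, DeleteOutcome and deleteIfStillPresent are hypothetical names, and the concurrent delete is simulated by a flag.

```cpp
#include <cassert>
#include <exception>
#include <set>

// Hypothetical stand-ins for illustration; none of these are MongoDB types.
struct ToyWriteConflict : std::exception {};

struct ToyCollection {
    std::set<int> docs;                    // record ids currently present
    bool concurrentDeletePending = false;  // test hook: conflict once and lose the document

    bool contains(int rid) const { return docs.count(rid) != 0; }

    // Simulates another writer deleting `rid` and causing a write conflict.
    void beginAttempt(int rid) {
        if (concurrentDeletePending) {
            concurrentDeletePending = false;
            docs.erase(rid);
            throw ToyWriteConflict();
        }
    }

    void deleteDocument(int rid) {
        assert(contains(rid));  // mirrors the existence assertion described above
        docs.erase(rid);
    }
};

enum class DeleteOutcome { kDeleted, kAlreadyGone };

// Retries the body until it completes without a (toy) write conflict.
template <typename F>
auto toyRetry(F&& f) -> decltype(f()) {
    while (true) {
        try {
            return f();
        } catch (const ToyWriteConflict&) {
            // Real code would back off and abandon the snapshot here.
        }
    }
}

// Option (2): check on every attempt whether the document still exists.
DeleteOutcome deleteIfStillPresent(ToyCollection& coll, int rid) {
    return toyRetry([&]() -> DeleteOutcome {
        coll.beginAttempt(rid);
        if (!coll.contains(rid)) {
            return DeleteOutcome::kAlreadyGone;  // nothing left to delete
        }
        coll.deleteDocument(rid);
        return DeleteOutcome::kDeleted;
    });
}
```

The important detail is that the existence check lives inside the retried lambda rather than before the retry loop, so it is re-evaluated against whatever the next attempt can see.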



 Comments   
Comment by Githook User [ 11/Jan/19 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-38179 range deleter must be prepared for document to be deleted from under it
Branch: master
https://github.com/mongodb/mongo/commit/a83b8477796991c522199cdd5b53800ae08c1e55

Comment by Githook User [ 11/Jan/19 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-38179 Refactor RemoveSaver out of dbhelpers
Branch: master
https://github.com/mongodb/mongo/commit/891ca0c23f979268fa0b9403500a8a582646613b
