Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.1.7
Affects Version/s: None
Component/s: Sharding
Labels:
- sharding-wfbf-day

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Sprint:
Sharding 2018-12-17, Sharding 2018-12-31, Sharding 2019-01-14
Linked BF Score:
45
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

The range deleter finds documents that have since been migrated away from a shard and deletes them. When performing the deletion, it correctly wraps the call to Collection::deleteDocument inside a writeConflictRetry:

collection_range_deleter.cpp

        exec->saveState();

        writeConflictRetry(opCtx, "delete range", nss.ns(), [&] {
            WriteUnitOfWork wuow(opCtx);
            if (saver) {
                uassertStatusOK(saver->goingToDelete(obj));
            }
            collection->deleteDocument(opCtx, kUninitializedStmtId, rloc, nullptr, true);
            wuow.commit();
        });

        try {
            exec->restoreState();

The call to collection->deleteDocument() ends up looking up the document and asserting that it exists. That assertion normally holds, but if we encounter a write conflict exception, the writeConflictRetry loop abandons the snapshot between attempts:

write_conflict_exception.h

    int attempts = 0;
    while (true) {
        try {
            return f();
        } catch (WriteConflictException const&) {
            CurOp::get(opCtx)->debug().additiveMetrics.incrementWriteConflicts(1);
            WriteConflictException::logAndBackoff(attempts, opStr, ns);
            ++attempts;
            opCtx->recoveryUnit()->abandonSnapshot();
        }
    }

Once the snapshot has been abandoned, we need to be able to handle the document no longer existing on the next attempt.

So I would propose that the code in collection_range_deleter.cpp either (1) change the PlanExecutor it constructs to include a delete stage, which is able to handle the document no longer existing, or (2) manually check before each delete whether the document still exists.
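As a rough illustration of option (2), the sketch below models the retry loop and the existence check with toy types. It is not the server code: ToyCollection, ToyWriteConflict, toyWriteConflictRetry, and deleteIfStillPresent are hypothetical names invented for this example, the snapshot abandonment is modelled only as a callback that lets a "concurrent writer" change the collection between attempts, and the conflict itself is injected artificially.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for WriteConflictException.
struct ToyWriteConflict : std::runtime_error {
    ToyWriteConflict() : std::runtime_error("write conflict") {}
};

// Hypothetical stand-in for a collection keyed by record id.
struct ToyCollection {
    std::map<int, std::string> docs;

    // Models the strict behaviour described in the ticket:
    // deleting a record that does not exist is a hard error.
    void deleteDocument(int rloc) {
        if (docs.erase(rloc) == 0) {
            throw std::logic_error("record not found");
        }
    }
};

// Simplified retry loop: rerun f until it stops throwing ToyWriteConflict.
// onRetry runs between attempts and stands in for everything that can
// happen once the old snapshot has been abandoned.
bool toyWriteConflictRetry(const std::function<bool()>& f,
                           const std::function<void()>& onRetry) {
    while (true) {
        try {
            return f();
        } catch (const ToyWriteConflict&) {
            onRetry();
        }
    }
}

// Option (2): re-check existence inside every attempt.
// Returns true if this call deleted the document, false if it was
// already gone. conflictsToInject forces that many artificial conflicts.
bool deleteIfStillPresent(ToyCollection& coll,
                          int rloc,
                          int conflictsToInject,
                          const std::function<void()>& concurrentWriter) {
    int remaining = conflictsToInject;
    return toyWriteConflictRetry(
        [&]() -> bool {
            // The check lives inside the retried body, so it is
            // repeated against fresh state on every attempt.
            if (coll.docs.find(rloc) == coll.docs.end()) {
                return false;  // nothing left to delete
            }
            if (remaining > 0) {
                --remaining;
                throw ToyWriteConflict();
            }
            coll.deleteDocument(rloc);
            return true;
        },
        concurrentWriter);
}
```

The point of the sketch is only the placement of the check: because it sits inside the retried lambda rather than before the loop, an attempt that follows a conflict sees the document as missing and returns without reaching the strict delete. Whether the real fix took this shape or the delete-stage route of option (1) is not stated in this ticket.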

Assignee: Randolph Tan
Reporter: Charlie Swanson
Participants: Charlie Swanson, Githook User, Randolph Tan
Votes: 0
Watchers: 3

Created: Nov 16 2018 05:59:36 PM UTC
Updated: Oct 29 2023 10:26:29 PM UTC
Resolved: Jan 11 2019 07:28:17 PM UTC
Confidence Status Last Update: 03/Dec/18 9:37 PM
