The range deleter attempts to find documents that have since been migrated away from a shard and delete them. When performing the deletion, it correctly wraps the call to Collection::deleteDocument inside a writeConflictRetry:

collection_range_deleter.cpp

    exec->saveState();

    writeConflictRetry(opCtx, "delete range", nss.ns(), [&] {
        WriteUnitOfWork wuow(opCtx);
        if (saver) {
            uassertStatusOK(saver->goingToDelete(obj));
        }
        collection->deleteDocument(opCtx, kUninitializedStmtId, rloc, nullptr, true);
        wuow.commit();
    });

    try {
        exec->restoreState();

The call to collection->deleteDocument() will end up looking up the document and asserting that the document exists. This should generally be true, but if we encounter a write conflict exception, the writeConflictRetry loop will abandon the snapshot in between attempts:

write_conflict_exception.h

    int attempts = 0;
    while (true) {
        try {
            return f();
        } catch (WriteConflictException const&) {
            CurOp::get(opCtx)->debug().additiveMetrics.incrementWriteConflicts(1);
            WriteConflictException::logAndBackoff(attempts, opStr, ns);
            ++attempts;
            opCtx->recoveryUnit()->abandonSnapshot();
        }
    }

Once the snapshot has been abandoned, the next attempt runs against a newer snapshot, so we need to be able to handle the document no longer existing on that attempt.
I would therefore propose that the code in collection_range_deleter.cpp either (1) change the PlanExecutor it constructs to include a delete stage that can handle the document no longer existing, or (2) manually check before each delete whether the document still exists.
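
To make option (2) concrete, here is a rough, self-contained toy model of the idea. It is not server code: ToyCollection, ToyWriteConflict, toyWriteConflictRetry, deleteIfStillPresent and the onAttempt test hook are all hypothetical names invented for this sketch, and the "snapshot" is simply the live map. The only point it tries to show is that the existence check has to sit inside the retried lambda, so that it is re-evaluated after each conflict.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration only; none of these are real server types.
struct ToyWriteConflict : std::runtime_error {
    ToyWriteConflict() : std::runtime_error("write conflict") {}
};

using ToyRecordId = int;

struct ToyCollection {
    std::map<ToyRecordId, std::string> docs;

    bool exists(ToyRecordId id) const {
        return docs.count(id) != 0;
    }

    // Models the behaviour described above: deleting a missing document is treated as a bug.
    void deleteDocument(ToyRecordId id) {
        if (!exists(id)) {
            throw std::logic_error("document to delete does not exist");
        }
        docs.erase(id);
    }
};

// Simplified retry loop: re-runs f until it completes without a ToyWriteConflict.
// The real loop also logs, backs off, and abandons the snapshot between attempts.
bool toyWriteConflictRetry(const std::function<bool()>& f) {
    while (true) {
        try {
            return f();
        } catch (const ToyWriteConflict&) {
            // Fall through and retry; the next attempt may observe newer data.
        }
    }
}

// Option (2): check for existence inside the retried body, before each delete.
// Returns true if this call deleted the document, false if it was already gone.
// onAttempt is a test-only hook that may mutate the collection or throw a conflict.
bool deleteIfStillPresent(ToyCollection& coll,
                          ToyRecordId id,
                          const std::function<void(int)>& onAttempt = {}) {
    int attempt = 0;
    return toyWriteConflictRetry([&]() -> bool {
        if (onAttempt) {
            onAttempt(attempt++);
        }
        if (!coll.exists(id)) {
            return false;  // Removed by someone else since the last attempt; nothing to do.
        }
        coll.deleteDocument(id);
        return true;
    });
}
```

In this model, a hook that erases the document and throws ToyWriteConflict on the first attempt makes deleteIfStillPresent return false on the second attempt instead of hitting the "does not exist" error. In the real code I would expect the equivalent check to have to happen inside the WriteUnitOfWork, against the same snapshot the delete uses, but that is an assumption rather than something I have verified.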