[SERVER-40421] Add failpoint to skip doing retries on WiredTiger prepare conflicts Created: 01/Apr/19  Updated: 29/Oct/23  Resolved: 09/Apr/19

Status: Closed
Project: Core Server
Component/s: Concurrency, Testing Infrastructure
Affects Version/s: None
Fix Version/s: 4.1.11

Type: New Feature
Priority: Major - P3
Reporter: Max Hirschhorn
Assignee: Vlad Rachev (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-40176 Cursor seekExact should not use WT_CU... Closed
Backwards Compatibility: Fully Compatible
Sprint: STM 2019-04-08, STM 2019-04-22
Participants:
Story Points: 1

 Description   

An operation within WiredTiger that attempts to get or set a value that has been prepared by another transaction may fail with a WT_PREPARE_CONFLICT error. (Note that until SERVER-40176 is addressed, this also applies to operations that may scan over such data.) The MongoDB layer then enqueues these operations to be retried after a prepared transaction has committed or aborted. To allow the rollback fuzzer to generate randomized insert, update, and delete operations that may hit prepare conflicts without hanging, it would be useful to add a failpoint to the wiredTigerPrepareConflictRetry() function that skips the retry logic and instead makes the command fail with a WriteConflict error response.
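
The requested behaviour can be sketched with a small standalone model. This is not the server code: the return-code constants, FailPointModel, prepareConflictRetryModel(), and ConflictingOp are all hypothetical names invented for illustration, and the "wait" step is reduced to a callback. It only shows the control flow described above: retry after each commit or abort by default, or return a rollback code immediately when the failpoint is on.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for WiredTiger return codes (the real values live in wiredtiger.h).
constexpr int kOk = 0;
constexpr int kPrepareConflict = -1;
constexpr int kRollback = -2;

// Hypothetical toggle modelling the proposed failpoint.
struct FailPointModel {
    bool enabled = false;
};

// Simplified model of a prepare-conflict retry wrapper (an assumption, not the real function).
// 'op' runs the storage operation and returns a code; 'waitForCommitOrAbort' stands in for
// blocking until some prepared transaction has committed or aborted.
inline int prepareConflictRetryModel(const FailPointModel& skipRetries,
                                     const std::function<int()>& op,
                                     const std::function<void()>& waitForCommitOrAbort) {
    assert(op && waitForCommitOrAbort);

    int ret = op();
    if (ret != kPrepareConflict) {
        return ret;  // No conflict: nothing to retry.
    }

    if (skipRetries.enabled) {
        // Failpoint on: give up immediately so the caller can surface a write conflict.
        return kRollback;
    }

    while (true) {
        waitForCommitOrAbort();
        ret = op();
        if (ret != kPrepareConflict) {
            return ret;
        }
    }
}

// Demo helper: reports a prepare conflict 'conflictsLeft' times, then succeeds.
struct ConflictingOp {
    int conflictsLeft;
    int calls = 0;

    int operator()() {
        ++calls;
        if (conflictsLeft > 0) {
            --conflictsLeft;
            return kPrepareConflict;
        }
        return kOk;
    }
};
```

With the toggle off, an operation that conflicts twice is attempted three times and succeeds. With the toggle on, it is attempted once and the wrapper returns the rollback code, which is what lets a fuzzer-generated workload fail fast instead of waiting on a prepared transaction that may never be resolved.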



 Comments   
Comment by Luke Chen [ 11/Apr/19 ]

Fixing up the fixVersion as this ticket was not included as part of the 4.1.10 release.

Comment by Githook User [ 09/Apr/19 ]

Author: vrachev <vlad.rachev@mongodb.com>

Message: SERVER-40421 Add failpoint to skip doing retries on WiredTiger prepare conflicts
Branch: master
https://github.com/mongodb/mongo/commit/2726092ccd5249ce44a0d0c784d612028036cb3e

Comment by Max Hirschhorn [ 01/Apr/19 ]

The following diff was sufficient to get the rollback fuzzer running successfully with prepared transactions, but it still needs test coverage similar to the skip_write_conflict_retries_failpoint.js test.

diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.cpp b/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.cpp
index 730d3e998c..c11efccf5c 100644
--- a/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.cpp
+++ b/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.cpp
@@ -41,6 +41,8 @@ namespace mongo {
 // When set, simulates WT_PREPARE_CONFLICT returned from WiredTiger API calls.
 MONGO_FAIL_POINT_DEFINE(WTPrepareConflictForReads);
 
+MONGO_FAIL_POINT_DEFINE(WTSkipPrepareConflictRetries);
+
 void wiredTigerPrepareConflictLog(int attempts) {
     LOG(1) << "Caught WT_PREPARE_CONFLICT, attempt " << attempts
            << ". Waiting for unit of work to commit or abort.";
diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.h b/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.h
index 7833c506ec..f3031a3105 100644
--- a/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.h
+++ b/src/mongo/db/storage/wiredtiger/wiredtiger_prepare_conflict.h
@@ -37,9 +37,12 @@
 
 namespace mongo {
 
-// When set, returns simulates returning WT_PREPARE_CONFLICT on WT cursor read operations.
+// When set, simulates returning WT_PREPARE_CONFLICT on WT cursor read operations.
 MONGO_FAIL_POINT_DECLARE(WTPrepareConflictForReads);
 
+// When set, WT_ROLLBACK is returned in place of retrying on WT_PREPARE_CONFLICT errors.
+MONGO_FAIL_POINT_DECLARE(WTSkipPrepareConflictRetries);
+
 /**
  * Logs a message with the number of prepare conflict retry attempts.
  */
@@ -66,6 +69,15 @@ int wiredTigerPrepareConflictRetry(OperationContext* opCtx, F&& f) {
     CurOp::get(opCtx)->debug().additiveMetrics.incrementPrepareReadConflicts(1);
     wiredTigerPrepareConflictLog(attempts);
 
+    if (MONGO_FAIL_POINT(WTSkipPrepareConflictRetries)) {
+        // Callers of wiredTigerPrepareConflictRetry() should eventually call wtRCToStatus() via
+        // invariantWTOK() and have the WT_ROLLBACK error bubble up as a WriteConflictException.
+        // Enabling the "skipWriteConflictRetries" failpoint in conjunction with the
+        // "WTSkipPrepareConflictRetries" failpoint prevents the higher layers from retrying the
+        // entire operation.
+        return WT_ROLLBACK;
+    }
+
     while (true) {
         attempts++;
         auto lastCount = recoveryUnit->getSessionCache()->getPrepareCommitOrAbortCount();
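
The comment in the diff above says that WT_ROLLBACK should surface as a WriteConflictException, and that the "skipWriteConflictRetries" failpoint is needed as well so that higher layers do not retry the whole operation. The sketch below models that interaction in isolation. Everything in it is a hypothetical stand-in (WriteConflictModel, checkReturnCodeModel(), writeConflictRetryModel(), kRollbackCode), and the retry loop is bounded by maxAttempts purely so the example terminates; it makes no claim about how the server's own retry helper is written.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical exception type standing in for a write-conflict error.
struct WriteConflictModel : std::runtime_error {
    WriteConflictModel() : std::runtime_error("WriteConflict") {}
};

// Hypothetical rollback code (the real WT_ROLLBACK is defined in wiredtiger.h).
constexpr int kRollbackCode = -2;

// Models the return-code translation step: a rollback code becomes an exception.
inline void checkReturnCodeModel(int rc) {
    if (rc == kRollbackCode) {
        throw WriteConflictModel();
    }
}

// Models a higher-layer retry loop. When 'skipWriteConflictRetries' is true the operation
// runs exactly once and any write conflict propagates to the caller. Otherwise the
// operation is re-run on conflict, up to 'maxAttempts' times (a bound added for this sketch).
inline int writeConflictRetryModel(bool skipWriteConflictRetries,
                                   int maxAttempts,
                                   const std::function<int()>& f) {
    assert(maxAttempts >= 1);

    if (skipWriteConflictRetries) {
        return f();
    }

    for (int attempt = 1;; ++attempt) {
        try {
            return f();
        } catch (const WriteConflictModel&) {
            if (attempt >= maxAttempts) {
                throw;  // Out of attempts: let the conflict reach the caller.
            }
        }
    }
}
```

In this model, enabling only the storage-level skip would still leave the outer loop re-running the operation and hitting the same prepared data each time. Enabling both toggles is what turns the conflict into a single, immediate error for the client.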
