[SERVER-34731] Fix race condition in read_concern_snapshot_yielding.js
Created: 27/Apr/18  Updated: 29/Oct/23  Resolved: 23/May/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.0.0-rc1, 4.1.1

Type: Bug
Priority: Major - P3
Reporter: Ian Boros
Assignee: Suganthi Mani
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
depends on SERVER-34726 Deadlock with locally stashed transac... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0
Sprint: Repl 2018-05-21, Repl 2018-06-04
Participants:
Linked BF Score: 66

 Description   

read_concern_snapshot_yielding.js contains code that waits for an operation to start. The test is racy because it assumes that once waitForOpId() has returned, the operation is hanging on a fail point. This is not necessarily true, and the test can fail if the following happens:

1) The operation starts, but does not reach the fail point
2) waitForOpId() runs and returns true
3) killOp is run
4) The operation checks for interrupt and terminates (without ever having reached the fail point)
5) assertKillPending() fails because the operation has terminated
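
A rough model of the interleaving above, written as plain JavaScript rather than the actual test code. The op object, its fields, and runInterleaving() are all hypothetical names made up for illustration. The sketch only shows why a wait that checks "the operation exists" can return too early, while a wait that also checks for the fail point message would keep polling.

```javascript
// Hypothetical model of steps 1-5; the field and function names are made up
// for illustration and are not taken from the real test or server.
const FAIL_POINT_MSG = "setInterruptOnlyPlansCheckForInterruptHang";

function runInterleaving(waitRequiresFailPointMsg) {
    // Step 1: the operation has started but has not reached the fail point,
    // so it carries no fail point message yet.
    const op = {opid: 1, active: true, msg: "", killPending: false};

    // Step 2: the wait predicate runs. The loose version only checks that
    // the operation exists; the strict version also checks its message.
    const waitSatisfied =
        op.active && (!waitRequiresFailPointMsg || op.msg === FAIL_POINT_MSG);
    if (!waitSatisfied) {
        return "wait keeps polling";
    }

    // Step 3: killOp marks the operation as killed.
    op.killPending = true;

    // Step 4: the operation checks for interrupt before reaching the fail
    // point and terminates.
    if (op.msg !== FAIL_POINT_MSG) {
        op.active = false;
    }

    // Step 5: the assertion needs a live operation that is still marked killed.
    return op.active && op.killPending ? "assertKillPending passes"
                                       : "assertKillPending fails";
}

console.log(runInterleaving(false));
console.log(runInterleaving(true));
```

With the loose wait the model ends in "assertKillPending fails"; with the strict wait it never proceeds to killOp until the operation is actually hanging.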

I'm marking this as depends-on SERVER-34726 because changing this test might hide the bug described in that ticket.



 Comments   
Comment by Githook User [ 23/May/18 ]

Author:

{'username': 'smani87', 'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com'}

Message: SERVER-34731 Fixes race condition in read_concern_snapshot_yielding.js

(cherry picked from commit 084e69ab140c37b010146a3acdb3da7e4977d9ea)
Branch: v4.0
https://github.com/mongodb/mongo/commit/469fe1b45febaf10f16833873c1722633e4dfd22

Comment by Githook User [ 23/May/18 ]

Author:

{'username': 'smani87', 'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com'}

Message: SERVER-34731 Fixes race condition in read_concern_snapshot_yielding.js
Branch: master
https://github.com/mongodb/mongo/commit/084e69ab140c37b010146a3acdb3da7e4977d9ea

Comment by Ian Boros [ 27/Apr/18 ]

Something like this should fix it:

diff --git a/jstests/noPassthrough/read_concern_snapshot_yielding.js b/jstests/noPassthrough/read_concern_snapshot_yielding.js
index 10fdada1f4..7a3bb1cfea 100644
--- a/jstests/noPassthrough/read_concern_snapshot_yielding.js
+++ b/jstests/noPassthrough/read_concern_snapshot_yielding.js
@@ -43,12 +49,22 @@
         let opId;
         assert.soon(
             function() {
-                const res = adminDB
-                                .aggregate([
-                                    {$currentOp: {}},
-                                    {$match: {$and: [{ns: coll.getFullName()}, curOpFilter]}}
-                                ])
-                                .toArray();
+                const res =
+                    adminDB
+                        .aggregate([
+                            {$currentOp: {}},
+                            {
+                              $match: {
+                                  $and: [
+                                      {ns: coll.getFullName()},
+                                      curOpFilter,
+                                      {"msg": "setInterruptOnlyPlansCheckForInterruptHang"}
+                                  ]
+                              }
+                            }
+                        ])
+                        .toArray();
+
                 if (res.length === 1) {
                     opId = res[0].opid;
                     return true;
diff --git a/src/mongo/db/query/plan_yield_policy.cpp b/src/mongo/db/query/plan_yield_policy.cpp
index 9f1c322de8..5552cb2ab5 100644
--- a/src/mongo/db/query/plan_yield_policy.cpp
+++ b/src/mongo/db/query/plan_yield_policy.cpp
@@ -32,6 +32,7 @@
 
 #include "mongo/db/concurrency/write_conflict_exception.h"
 #include "mongo/db/curop.h"
+#include "mongo/db/curop_failpoint_helpers.h"
 #include "mongo/db/operation_context.h"
 #include "mongo/db/query/query_knobs.h"
 #include "mongo/db/query/query_yield.h"
@@ -101,7 +102,11 @@ Status PlanYieldPolicy::yieldOrInterrupt(stdx::function<void()> beforeYieldingFn
         ON_BLOCK_EXIT([this]() { resetTimer(); });
         OperationContext* opCtx = _planYielding->getOpCtx();
         invariant(opCtx);
-        MONGO_FAIL_POINT_PAUSE_WHILE_SET(setInterruptOnlyPlansCheckForInterruptHang);
+        CurOpFailpointHelpers::waitWhileFailPointEnabled(
+            &setInterruptOnlyPlansCheckForInterruptHang,
+            opCtx,
+            "setInterruptOnlyPlansCheckForInterruptHang");
+
         return opCtx->checkForInterruptNoAssert();
     }
 
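To make the effect of the extra $match clause concrete, here is a small sketch that applies the same three conditions to a plain array instead of a live $currentOp cursor. It assumes the helper in the C++ change surfaces its third argument as the operation's msg field, which is what the proposed filter relies on. findOpsHangingOnFailPoint(), the sample documents, and the predicate standing in for curOpFilter are hypothetical.

```javascript
// Hypothetical stand-in for the proposed $match stage. It filters a plain
// array of $currentOp-style documents instead of querying a live server.
function findOpsHangingOnFailPoint(currentOps, ns, curOpPredicate, failPointMsg) {
    return currentOps.filter(
        op => op.ns === ns && curOpPredicate(op) && op.msg === failPointMsg);
}

// Made-up sample documents; real $currentOp output has many more fields.
const hangMsg = "setInterruptOnlyPlansCheckForInterruptHang";
const sampleOps = [
    {opid: 11, ns: "test.coll", op: "query"},                 // started, not yet hanging
    {opid: 12, ns: "test.coll", op: "query", msg: hangMsg},   // hanging on the fail point
    {opid: 13, ns: "test.other", op: "query", msg: hangMsg},  // different namespace
];

const hanging = findOpsHangingOnFailPoint(
    sampleOps, "test.coll", op => op.op === "query", hangMsg);
console.log(hanging.map(op => op.opid));
```

Only opid 12 matches, so a wait built on this filter would not return for an operation that has started but is not yet paused on the fail point.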
