[SERVER-41061] Add repl and sharding concurrency suites with "simultaneous" and "sameCollection" specified Created: 08/May/19  Updated: 30/May/19  Resolved: 30/May/19

Status: Closed
Project: Core Server
Component/s: Replication, Sharding, Testing Infrastructure
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Judah Schvimer Assignee: Max Hirschhorn
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

We would likely need to reduce the number of assertions we can make. At the very least, we can make sure we don't crash, deadlock, or end up with inconsistent data. I suspect this would have caught the deadlocks in SERVER-40700.



 Comments   
Comment by Max Hirschhorn [ 13/May/19 ]

I attempted to run a version of the concurrency_simultaneous_replication.yml test suite that (1) points db[collName] to the same namespace for all FSM workloads and (2) turns the assertAlways(), assertWhenOwnColl(), and assertWhenOwnDB() assertion functions into no-ops. Note that FSM workloads which operate on a unique namespace (by encoding their filename into it) continue to do so.

The test suite failed for a variety of reasons, and I imagine it is going to be difficult to stabilize. The most obviously problematic workload was agg_out_interrupt_cleanup.js, which ran killOp on operations from other workloads because they all shared the same "input collection", but there are surely others. judah.schvimer, we should revisit this idea to see if there's another way to get coverage of the deadlock scenarios you're interested in.

diff --git a/buildscripts/resmokeconfig/suites/concurrency_simultaneous_replication.yml b/buildscripts/resmokeconfig/suites/concurrency_simultaneous_replication.yml
index 21a5ea716e..65404cfcb2 100644
--- a/buildscripts/resmokeconfig/suites/concurrency_simultaneous_replication.yml
+++ b/buildscripts/resmokeconfig/suites/concurrency_simultaneous_replication.yml
@@ -49,6 +49,7 @@ executor:
       - ValidateCollections
     tests: true
   config:
+    same_collection: true
     shell_options:
       readMode: commands
       global_vars:
diff --git a/jstests/concurrency/fsm_libs/assert.js b/jstests/concurrency/fsm_libs/assert.js
index 437742ac39..f15d3af207 100644
--- a/jstests/concurrency/fsm_libs/assert.js
+++ b/jstests/concurrency/fsm_libs/assert.js
@@ -29,6 +29,7 @@ var AssertLevel = (function() {
     }
 
     return {
+        NEVER: new AssertLevel(-1000),
         ALWAYS: new AssertLevel(0),
         OWN_COLL: new AssertLevel(1),
         OWN_DB: new AssertLevel(2),
diff --git a/jstests/concurrency/fsm_libs/resmoke_runner.js b/jstests/concurrency/fsm_libs/resmoke_runner.js
index 2b6b1512be..ee77db920b 100644
--- a/jstests/concurrency/fsm_libs/resmoke_runner.js
+++ b/jstests/concurrency/fsm_libs/resmoke_runner.js
@@ -45,9 +45,12 @@
             assertLevel = AssertLevel.OWN_COLL;
         }
         if (clusterOptions.sameCollection) {
-            // The collection is shared by multiple workloads, so only make the asserts that always
-            // apply.
-            assertLevel = AssertLevel.ALWAYS;
+            // The collection is shared by multiple workloads, so we can theoretically only make the
+            // asserts that always apply. However, we've never attempted to run with
+            // sameCollection=true. It isn't likely for all the assertAlways() assertions to be
+            // correct. We pessimistically choose to make no assertions while running the FSM
+            // workloads.
+            assertLevel = AssertLevel.NEVER;
         }
         globalAssertLevel = assertLevel;
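
For illustration only, here is a minimal, self-contained sketch of why a level of -1000 turns the three assertion functions into no-ops. It is not the code in fsm_libs/assert.js: the permits() method, the makeAsserter() helper, and the boolean return values are all hypothetical, and it assumes an assertion runs only when the global level is at least the level the assertion was tagged with.

```javascript
// Hypothetical model of level-gated assertions; NOT the real fsm_libs/assert.js.
class AssertLevel {
    constructor(level) {
        this.level = level;
    }

    // Assumption: an assertion tagged `required` runs only if the global
    // level is greater than or equal to the required level.
    permits(required) {
        return this.level >= required.level;
    }
}

const Levels = {
    NEVER: new AssertLevel(-1000),  // below every tagged level, so nothing runs
    ALWAYS: new AssertLevel(0),
    OWN_COLL: new AssertLevel(1),
    OWN_DB: new AssertLevel(2),
};

// Builds the three assertion functions for a given global level.
// Each returns true if the check ran and false if it was skipped.
function makeAsserter(globalLevel) {
    function assertWithLevel(required, condition, msg) {
        if (!globalLevel.permits(required)) {
            return false;  // skipped: behaves as a no-op
        }
        if (!condition) {
            throw new Error(msg);
        }
        return true;
    }

    return {
        assertAlways: (cond, msg) => assertWithLevel(Levels.ALWAYS, cond, msg),
        assertWhenOwnColl: (cond, msg) => assertWithLevel(Levels.OWN_COLL, cond, msg),
        assertWhenOwnDB: (cond, msg) => assertWithLevel(Levels.OWN_DB, cond, msg),
    };
}

// With NEVER, even a failing assertAlways() is skipped instead of throwing.
const silent = makeAsserter(Levels.NEVER);
silent.assertAlways(false, "would fail under ALWAYS");

// With ALWAYS, only assertAlways() is enforced; the stricter ones are skipped.
const shared = makeAsserter(Levels.ALWAYS);
shared.assertWhenOwnColl(false, "skipped because the collection is shared");
```

Under this model, NEVER needs only to be lower than ALWAYS; the diff's choice of -1000 simply leaves room for intermediate levels.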
