Hey jerrytstng@gmail.com, thank you for reaching out. I reviewed the jepsen.log file embedded in the jepsen.zip archive and noticed that, as with your report in SERVER-75296, none of the --read-concern, --write-concern, --txn-read-concern, or --txn-write-concern options were specified in your command invocation. This means the system-default read and write concern levels are being used for the test run. In MongoDB 6.0, the default read concern level is "local" and the default write concern level is "majority". Read concern level "local" isn't sufficient to provide read committed isolation (see also our documentation on Read Isolation, which discusses this), so certain classes of anomalies the Elle verifier checks for are to be expected. I'll expand below on why the anomaly you've observed is one of these expected ones.
jepsen.log
lein run test --workload list-append --nemesis all --nodes-file /root/nodes --time-limit 86400 --test-count 1 --sharded
One detail which isn't covered in our public documentation is that read concern level "local" and read concern level "majority" behave identically for transactions which successfully commit with write concern {w: "majority"}. This is a direct outcome of non-snapshot transactions using what is referred to as 'speculative majority' to reduce the number of spurious write conflicts in back-to-back transactions. The core idea is for non-snapshot transactions to read the latest view of the primary's data, with the safety of read concern level "majority" (i.e. avoiding dirty reads of data which could roll back) being satisfied by waiting for the effects of the transaction to have replicated at write concern {w: "majority"}. For transactions which only do reads (and no writes), a no-op write is replicated and waited on at commit time. More details are summarized here in the mongodb/mongo repository, in documentation targeted at engineers who work on the MongoDB Server itself.
Given that the current implementation of transactions uses speculative majority, it'd be reasonable to ask where the anomaly originates. While running a test case, Jepsen has its workers perform sequences of operations. Further examination of the incompatible-order anomaly reveals that the inconsistent reads are only ever observed in cases where the sequence consists of a single operation. By default, the Jepsen list-append test won't run these single operations in a transaction. This means the speculative majority behavior won't apply to them. These reads performed outside of a transaction can therefore read data which may roll back, and so cannot provide read committed isolation. (Again, please refer to the Read Isolation documentation.) Note that the Jepsen list-append test supports the --singleton-txns command line option to force a single operation to be executed within a transaction. You would find that by including --singleton-txns in your command line invocation the incompatible-order anomaly goes away.
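For illustration, here is a minimal dry-run sketch of the adjusted invocation. It assumes the flag can simply be appended to the command recorded in jepsen.log; the variable names base_cmd and singleton_cmd are hypothetical, and the command is only printed rather than executed.

```shell
# Dry-run sketch: rebuild the invocation recorded in jepsen.log and append
# --singleton-txns so single-operation sequences also run in a transaction.
# base_cmd and singleton_cmd are illustrative names, not part of Jepsen.
base_cmd="lein run test --workload list-append --nemesis all --nodes-file /root/nodes --time-limit 86400 --test-count 1 --sharded"
singleton_cmd="$base_cmd --singleton-txns"

# Print the command instead of running it.
echo "$singleton_cmd"
```

Running the printed command from the Jepsen MongoDB test checkout would start the actual test run.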
Below are a few lines extracted from the jepsen.log file showing that each anomalous read comes from a sequence containing only one operation. The same holds true for the other incompatible-order reads in the shared output.
2023-04-19 07:17:45,573{GMT} INFO [jepsen worker 0] jepsen.util: [ 15267558117007 ] 23319 :ok :txn [[:r 290 [4 5 6 11 19]]]
2023-04-19 07:45:38,578{GMT} INFO [jepsen worker 3] jepsen.util: [ 16940562771394 ] 26184 :ok :txn [[:r 308 [22 23 19 16 24 45 46]]]
2023-04-19 07:52:18,200{GMT} INFO [jepsen worker 8] jepsen.util: [ 17340185198991 ] 26819 :ok :txn [[:r 310 [10 25 18]]]
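One way to check this property for other lines in the log is to count the micro-operations in each :txn vector. The following is a rough sketch, assuming every operation opens with "[:r " or "[:append " as in the lines above; the helper name count_ops is hypothetical, and the two-operation line in the usage is made up for contrast.

```shell
# Count the micro-operations in one jepsen.log history line.
# Assumption: every operation inside the :txn vector opens with
# "[:r " or "[:append ". count_ops is an illustrative helper name.
count_ops() {
  printf '%s\n' "$1" \
    | sed 's/.*:txn //' \
    | grep -oE '\[:(r|append) ' \
    | wc -l \
    | tr -d ' '
}

# One of the anomalous reads quoted above.
line='2023-04-19 07:17:45,573{GMT} INFO [jepsen worker 0] jepsen.util: [ 15267558117007 ] 23319 :ok :txn [[:r 290 [4 5 6 11 19]]]'
count_ops "$line"

# A made-up two-operation transaction, for contrast.
count_ops ':ok :txn [[:append 5 1] [:r 5 [1]]]'
```

A count of 1 marks a line that, by default, would have run outside of a transaction.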
Lastly, I'll note that if you want to continue running the Jepsen list-append test with non-snapshot read concerns, you would benefit from making a change similar to the following to how the Elle model checker is configured. The G-nonadjacent, G-single, and G-single-realtime anomalies which were also mentioned in the jepsen.log file refer to snapshot isolation violations and are to be expected in such a configuration, as covered by my other comment in SERVER-75296.
diff --git a/src/jepsen/mongodb/list_append.clj b/src/jepsen/mongodb/list_append.clj
index 828100a..3b63ddd 100644
--- a/src/jepsen/mongodb/list_append.clj
+++ b/src/jepsen/mongodb/list_append.clj
@@ -141,5 +141,5 @@
  :key-dist :exponential
  :max-txn-length (:max-txn-length opts 4)
  :max-writes-per-key (:max-writes-per-key opts)
- :consistency-models [:strong-snapshot-isolation]})
+ :consistency-models [:read-committed]})
  :client (Client. nil)))