-
Type: Task
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Correctness 2026-03-24
-
200
-
- Summary
Fixes WiredTiger timestamp conflicts in Antithesis testing when multiple ValidateCollections hooks run concurrently on the same MongoDB cluster.
-
- Problem
In FSM workload tests with `maxTestQueueSize > 1`, concurrent test jobs share the same MongoDB fixture. When tests complete, their ValidateCollections hooks run simultaneously, causing conflicts:
- Job 1's hook inserts into `test.validate.hook` at timestamp T for internode validation
- Job 2's hook runs validate commands concurrently, reading at timestamp T
- *Conflict*: Job 1 cannot commit at T because Job 2 is already reading at T
- *Error*: `commit timestamp must be after all active read timestamps`
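The sequence above can be illustrated with a toy model. This is only a sketch of the rule quoted in the error message, not WiredTiger's actual implementation. The class `ToyTimestampStore` and its methods are hypothetical names invented for illustration.

```python
class ToyTimestampStore:
    """Toy stand-in for the rule in the error message (not WiredTiger itself)."""

    def __init__(self):
        # Timestamps of reads that are currently in progress.
        self.active_reads = []

    def begin_read(self, ts):
        self.active_reads.append(ts)

    def end_read(self, ts):
        self.active_reads.remove(ts)

    def commit(self, ts):
        # Assumed rule: a commit must be strictly after every active read timestamp.
        if any(ts <= read_ts for read_ts in self.active_reads):
            raise RuntimeError(
                "commit timestamp must be after all active read timestamps"
            )
        return ts


store = ToyTimestampStore()
T = 100

store.begin_read(T)  # Job 2's validate command reads at T
try:
    store.commit(T)  # Job 1's insert tries to commit at T
    conflict = False
except RuntimeError:
    conflict = True

store.end_read(T)  # Job 2 finishes reading
committed = store.commit(T)  # the same commit is accepted once no read is active
```

In this model the commit fails only while the read at T is still active, which is why running the hooks one at a time avoids the error.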
-
- Solution
Add a global `threading.Lock` to serialize ValidateCollections hooks across all jobs:
- *replicaset.py*: Define `_GLOBAL_VALIDATION_LOCK` at module level
- *shardedcluster.py*: Reference the same lock via `_validation_lock` attribute
- *validate.py*: Acquire fixture's `_validation_lock` before running validation
With this change, only one ValidateCollections hook executes at a time, which prevents the timestamp conflict. Fixtures that lack the lock remain supported, so the change is backward compatible.
-
- Why SERVER-115225 Wasn't Enough
SERVER-115225 improved timestamp accuracy for a single hook but didn't address concurrent hooks running on the same cluster.
- is related to
SERVER-115225 Use better timestamp for validate atClusterTime (Closed)