- Type: Bug
- Resolution: Gone away
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Cache and Eviction
- Security Level: Public (Available to anyone on the web)
- Storage Engines - Foundations
- 4,061.349
- SE Foundations - Q4+ Backlog
- 5
The entire step-up procedure in disagg mongod runs while holding the SLS state machine mutex. Reads must take this mutex to read mongod state, so any blocking code in the step-up path also blocks reads.
A cloud-dev cluster deadlocked because of this. In the core dump, the step-up thread appears to be stuck in WT eviction code. I spun this ticket out from SERVER-111830.
mongod version: 8.3.0-alpha0-1949-g7d9ab91 (gitVersion 7d9ab91d278c1da8838a9c25473ab159c5d44714)
core dump link
guide to looking at cloud-dev core dumps
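The contention described above can be illustrated with a minimal sketch. This is not mongod or WiredTiger code: `state_mutex` is a hypothetical stand-in for the SLS state machine mutex, and the `may_finish` event stands in for whatever blocking call (for example, waiting on eviction) the step-up thread is stuck in. It only shows that a reader needing the same mutex cannot make progress until step-up returns.

```python
import threading


def step_up(state_mutex, holding, may_finish):
    """Hypothetical step-up: holds the mutex for the entire procedure."""
    with state_mutex:
        holding.set()
        # Stand-in for a blocking call inside step-up (e.g. stuck in eviction).
        may_finish.wait()


def try_read(state_mutex, timeout_s):
    """Return True if a reader could take the mutex within timeout_s seconds."""
    if not state_mutex.acquire(timeout=timeout_s):
        return False
    # A real reader would inspect node state here before releasing.
    state_mutex.release()
    return True


def demo():
    state_mutex = threading.Lock()
    holding = threading.Event()
    may_finish = threading.Event()

    worker = threading.Thread(target=step_up, args=(state_mutex, holding, may_finish))
    worker.start()
    holding.wait()                             # step-up now owns the mutex

    read_during = try_read(state_mutex, 0.1)   # times out while step-up is blocked
    may_finish.set()                           # let the blocking call return
    worker.join()
    read_after = try_read(state_mutex, 0.1)    # mutex is free again
    return read_during, read_after


if __name__ == "__main__":
    print(demo())  # (False, True)
```

In the sketch the reader uses a timeout so the demo terminates. A reader that waits indefinitely would hang for as long as step-up stays blocked, which matches the behavior reported on the cluster.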
- is duplicated by
  - WT-15765 disagg deadlock on startup in session_truncate() (Closed)
  - WT-15387 test/format (disagg.mode=switch) failed to step-up as primary: WT_ROLLBACK concurrent conflict (Closed)
- is related to
  - WT-14949 Add a check that all transactions/cursors should be closed when step down/step up happens in connection->reconfigure (Open)
  - WT-14735 Layered tables performance improvements (Backlog)
  - WT-15798 Temporarily ignore updates and dirty threshold during step-up (Closed)
  - WT-15595 Ignore eviction target/trigger in follower mode (Closed)
  - WT-15449 test/format (disagg.mode=switch) cache stuck (Closed)
- related to
  - WT-15765 disagg deadlock on startup in session_truncate() (Closed)
  - WT-15808 Support readers when performing step-up (Open)
  - WT-15387 test/format (disagg.mode=switch) failed to step-up as primary: WT_ROLLBACK concurrent conflict (Closed)