- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- Affects Version/s: 4.1.6
- Component/s: None
- ALL
- Repl 2019-01-28
Once a read-only transaction on a secondary starts and its locks are stashed, the replication applier blocks when it tries to acquire the global X lock (for example, when replicating a create collection command). Once the X lock request is queued, new requests to the secondary queue behind it until it is satisfied or abandoned. This can cause a deadlock such as the following:
1. Txn1 starts, performs its operations, and stashes its locks.
2. The replication applier requests the global X lock, which conflicts with the stashed locks, so the lock request is queued.
3. Txn1 continues: it checks out the session and tries to satisfy the read concern. This involves checking whether the oplog collection exists, which requires the global IS lock, so that request is queued behind the X request from step 2 (note: this happens before the locks are unstashed).
4. The periodic transaction killer sees that Txn1 has already expired and tries to kill it by checking out the session, but it blocks waiting for step 3 to check the session back in.
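The steps above can be sketched as a small wait-for graph. This is a toy model, not the server's lock manager: `GlobalLockModel`, `blockers_of`, `find_cycle`, and the owner names (`txn1_stashed`, `txn1_op`, `applier`, `killer`) are all hypothetical, and the model assumes a strictly FIFO queue in which any new request waits behind an already queued one. The two session edges are added by hand, on the assumption that the stashed locks are released only once the session can be checked out again.

```python
from collections import deque

# Lock-mode compatibility for the two modes that appear in this ticket.
COMPATIBLE = {
    ("IS", "IS"): True,
    ("IS", "X"): False,
    ("X", "IS"): False,
    ("X", "X"): False,
}


class GlobalLockModel:
    """Toy FIFO model of a single global lock (hypothetical, for illustration)."""

    def __init__(self):
        self.granted = []       # list of (owner, mode) currently holding the lock
        self.pending = deque()  # FIFO queue of (owner, mode) waiting for the lock

    def request(self, owner, mode):
        conflicts = any(not COMPATIBLE[(mode, m)] for _, m in self.granted)
        # Assumed fairness rule: anything arriving after a queued request waits.
        if conflicts or self.pending:
            self.pending.append((owner, mode))
            return "queued"
        self.granted.append((owner, mode))
        return "granted"

    def blockers_of(self, owner):
        """Owners that must release or leave the queue before `owner` can run."""
        queue = list(self.pending)
        for idx, (who, mode) in enumerate(queue):
            if who == owner:
                holders = {o for o, m in self.granted if not COMPATIBLE[(mode, m)]}
                ahead = {o for o, _ in queue[:idx]}
                return holders | ahead
        return set()


def find_cycle(waits_for, start):
    """Return a wait-for cycle reachable from `start`, or None if there is none."""
    path, on_path, visited = [], set(), set()

    def visit(node):
        if node in on_path:
            return path[path.index(node):] + [node]
        if node in visited:
            return None
        visited.add(node)
        path.append(node)
        on_path.add(node)
        for nxt in sorted(waits_for.get(node, ())):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        return None

    return visit(start)


lock = GlobalLockModel()
step1 = lock.request("txn1_stashed", "IS")  # step 1: Txn1's stashed lock is held
step2 = lock.request("applier", "X")        # step 2: conflicts with the stashed IS
step3 = lock.request("txn1_op", "IS")       # step 3: waits behind the queued X

waits_for = {
    "applier": lock.blockers_of("applier"),
    "txn1_op": lock.blockers_of("txn1_op"),
    # Session edges (assumptions of this model, added by hand):
    "txn1_stashed": {"txn1_op"},  # stashed locks stay held while step 3 owns the session
    "killer": {"txn1_op"},        # step 4: the killer needs the same session
}
cycle = find_cycle(waits_for, "killer")
print(step1, step2, step3)
print(" -> ".join(cycle))
```

In this model the killer never sits on the cycle itself; it waits on `txn1_op`, which is part of the cycle, so it cannot make progress either.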
More notes:
The periodic transaction killer actually kills the session's opCtx before trying to check the session out, but step 3 is also blocked on the pbwm resource mutex while trying to acquire the GlobalLock. That operation does not use the opCtx, so it cannot be interrupted by killOp.
- duplicates
- SERVER-39139 Remove testing support for secondary transactions
- Closed
- is related to
- SERVER-39139 Remove testing support for secondary transactions
- Closed