-
Type: Bug
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Replication
-
None
-
Fully Compatible
-
ALL
-
v4.2
-
Repl 2019-07-01, Repl 2019-07-15, Repl 2019-07-29
Consider the following state transitions.
Step down (primary to secondary).
1. The primary stashes the lock resources of its prepared transactions with maxLockTimeout set to a non-zero value (5ms by default). Its stash style is StashStyle::kPrimary.
2. Step down unstashes the lock resources of prepared transactions from the transactionParticipant into its opCtx, then restashes them with StashStyle::kSecondary, preserving the maxLockTimeout value.
3. The node is now a secondary. When secondary oplog application applies the commit command for that transaction, it unstashes the lock resources with maxLockTimeout still set. Unstashing reacquires the yielded locks and the ticket subject to that maxLockTimeout, so the reacquisition can fail, which can crash the server.
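The step-down sequence above can be modelled with a small standalone sketch. It is not the server's code: `StashedLocks`, `stashOnPrimary`, `restashOnStepDown`, `unstashSucceeds` and the simulated wait time are all hypothetical names introduced for illustration, and only `StashStyle` and the 5ms default come from the description.

```cpp
#include <chrono>
#include <optional>

using Milliseconds = std::chrono::milliseconds;

// Hypothetical stand-ins for the server's types (illustration only).
enum class StashStyle { kPrimary, kSecondary };

struct StashedLocks {
    StashStyle style;
    // std::nullopt models "maxLockTimeout unset", i.e. wait indefinitely.
    std::optional<Milliseconds> maxLockTimeout;
};

// Step 1: the primary stashes with a non-zero maxLockTimeout (5ms by default).
StashedLocks stashOnPrimary() {
    return {StashStyle::kPrimary, Milliseconds{5}};
}

// Step 2: step down restashes in secondary style but preserves the timeout.
StashedLocks restashOnStepDown(const StashedLocks& stashed) {
    return {StashStyle::kSecondary, stashed.maxLockTimeout};
}

// Step 3: model reacquiring the locks and ticket. `waitNeeded` is an assumed
// input saying how long the resources would take to become available.
bool unstashSucceeds(const StashedLocks& stashed, Milliseconds waitNeeded) {
    return !stashed.maxLockTimeout || waitNeeded <= *stashed.maxLockTimeout;
}
```

In this model, `unstashSucceeds(restashOnStepDown(stashOnPrimary()), Milliseconds{50})` returns false: the secondary inherits the primary's 5ms bound, so any longer wait during oplog application fails.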
Step up (secondary to primary).
1. The secondary stashes the lock resources of its prepared transactions with maxLockTimeout unset. Its stash style is StashStyle::kSecondary.
2. Step up unstashes the lock resources of prepared transactions from the transactionParticipant into its opCtx, then restashes them with StashStyle::kPrimary, preserving the (unset) maxLockTimeout value.
3. The node is now a primary. When it runs a commit command for that transaction, it unstashes the lock resources with maxLockTimeout unset, so the ticket is reacquired with no maxLockTimeout. This breaks the contract on transaction timeout.
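The step-up case can be sketched the same way. Again, this is only a model under stated assumptions: `StashedLocks`, `stashOnSecondary`, `restashOnStepUp` and `honorsTimeoutContract` are hypothetical names, and the "contract" is reduced here to the assumption that a primary-style stash must carry a lock timeout.

```cpp
#include <chrono>
#include <optional>

using Milliseconds = std::chrono::milliseconds;

// Hypothetical stand-ins for the server's types (illustration only).
enum class StashStyle { kPrimary, kSecondary };

struct StashedLocks {
    StashStyle style;
    // std::nullopt models "maxLockTimeout unset".
    std::optional<Milliseconds> maxLockTimeout;
};

// Step 1: the secondary stashes with maxLockTimeout unset.
StashedLocks stashOnSecondary() {
    return {StashStyle::kSecondary, std::nullopt};
}

// Step 2: step up restashes in primary style but preserves the unset timeout.
StashedLocks restashOnStepUp(const StashedLocks& stashed) {
    return {StashStyle::kPrimary, stashed.maxLockTimeout};
}

// Step 3: assumed contract check. A primary-style stash without a timeout
// would reacquire its ticket with no bound on the wait.
bool honorsTimeoutContract(const StashedLocks& stashed) {
    return stashed.style != StashStyle::kPrimary ||
           stashed.maxLockTimeout.has_value();
}
```

Here `honorsTimeoutContract(restashOnStepUp(stashOnSecondary()))` returns false, which mirrors step 3: the new primary holds a primary-style stash whose timeout was never set.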
- related to
-
SERVER-41556 Must handle failure to reacquire locks and ticket when unstashing transaction
- Closed