[SERVER-40700] Deadlock between read prepare conflicts and state transitions Created: 17/Apr/19 Updated: 29/Oct/23 Resolved: 16/May/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Storage |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.12 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Judah Schvimer | Assignee: | Pavithra Vetriselvan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | prepare_durability, todo_in_code | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Sprint: | Repl 2019-05-06, Repl 2019-05-20 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
We do not kill user or internal reads on step down, similarly to how we do not kill internal writes on step down. Thus if they hit a prepare conflict they will cause the deadlock described in Solution:
Details:
The S mode acquisition on step up and step down means concurrent state transitions could start happening. To protect against this we will add a new LockManager ResourceMutex that the ReplicationStateTransationLockGuard acquires after acquiring the RSTL, and releases before it. This should be a straightforward way to allow reads and step-ups/step-downs to not conflict (via S and IS locks) but for step-ups and step-downs to conflict even though they take S locks (via the ResourceMutex that does not interact with reads at all). |
| Comments |
| Comment by Githook User [ 16/May/19 ] | |||
|
Author: {'email': 'benety@mongodb.com', 'name': 'Benety Goh', 'username': 'benety'}Message: | |||
| Comment by Githook User [ 16/May/19 ] | |||
|
Author: {'email': 'benety@mongodb.com', 'name': 'Benety Goh', 'username': 'benety'}Message: | |||
| Comment by Githook User [ 16/May/19 ] | |||
|
Author: {'name': 'Pavi Vetriselvan', 'username': 'pvselvan', 'email': 'pvselvan@umich.edu'}Message: | |||
| Comment by Suganthi Mani [ 07/May/19 ] | |||
|
We would not implement the new approach described in the "Description" box. Instead, we will be going with the approach where reads using afterClusterTime that encounter prepare conflicts will be killed during step down. Implementation Detail: Whenever the read thread hits prepare conflict, it waits either on this conditional variable WiredTigerSessionCache::_prepareCommittedOrAbortedCond or for interrupt.
| |||
| Comment by Judah Schvimer [ 06/May/19 ] | |||
|
One note here, after talking with geert.bosch, is that we can upgrade locks already through lock conversion. It's possible we don't currently do this anywhere, but the lock manager supports converting locks from a weaker mode to a stronger mode. This would cause problems with the original design here, but may not with the new plan. | |||
| Comment by Judah Schvimer [ 18/Apr/19 ] | |||
Yes, done.
Great point! I agree. I don't think we need to kill operations that acquire the global lock in S mode anymore. Can you clarify why the 3 way deadlock only manifested with a global lock in S mode and not with a collection or DB lock in S mode? It seems to me that the following would be possible:
I think this scenario goes away when the read acquires RSTL in mode S or IS and the stepdown acquires the RSTL in mode S so they do not conflict. Can you also please clarify why we didn't need to kill S lock holders on step up? I don't think we need to anymore with this new design, but I want to understand why we didn't in the first place.
Can you please remind me why we had the looping behavior in the first place? I looked in | |||
| Comment by Suganthi Mani [ 18/Apr/19 ] | |||
|
judah.schvimer, I am trying to understand the solution and details in the description. And, I have few initial questions regarding that 1) With the new approach, both step down and step up will acquire RSTL in S mode? If yes, Could you update the the sentence "Stepdown will now require RSTL in S." in the description.
With the new approach, I assume that the thread that takes global lock in S mode would be considered as reads and they would take RSTL in S mode.This means, read thread(1) and step down thread(2) doesn't conflict each other. So, I feel we don't want to kill the operations that acquire global lock in S mode. | |||
| Comment by Judah Schvimer [ 17/Apr/19 ] | |||
|
Note that step-up doesn't need to kill operations since the only operations that can take place outside of secondary batch application on secondaries are reads that will no longer conflict with the RSTL. We currently kill operations that acquire global locks in mode S, but we may be able to relax that check here. Otherwise we may need to kill those operations on step-up. |