[SERVER-57756] Race between concurrent stepdowns and applying transaction oplog entry Created: 16/Jun/21 Updated: 29/Oct/23 Resolved: 07/Jul/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.0.2, 5.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Samyukta Lanka | Assignee: | Wenbin Zhu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v5.0, v4.4, v4.2 |
| Sprint: | Repl 2021-06-28, Repl 2021-07-12 |
| Participants: | |
| Case: | (copied to CRM) |
| Linked BF Score: | 120 |
| Description |
|
If a node is stepping down in multiple threads at once, one thread can lag behind the other stepdown thread and still be running when the node starts applying oplog entries as a secondary. In this case, if the remaining stepdown thread still has ScopedBlockSessionCheckouts in scope, it will block checking out sessions, which could cause us to fail to apply a commitTransaction oplog entry, ultimately triggering this fassert. |
| Comments |
| Comment by Githook User [ 21/Jul/21 ] |
|
Author: {'name': 'Wenbin Zhu', 'email': 'wenbin.zhu@mongodb.com', 'username': 'WenbinZhu'} Message: (cherry picked from commit 8588a5a3a52f17026b1e5a21c00e815fbb702a7c) |
| Comment by Githook User [ 07/Jul/21 ] |
|
Author: {'name': 'Wenbin Zhu', 'email': 'wenbin.zhu@mongodb.com', 'username': 'WenbinZhu'} Message: |
| Comment by Wenbin Zhu [ 01/Jul/21 ] |
|
After talking to the sharding team, I think we can revert
Also, as mentioned in the previous comment, the reason I think the 4.2/4.4 workaround is only best-effort is that we can still construct sequences that produce a deadlock. The workaround uses inserterOpCtx->setAlwaysInterruptAtStepDownOrUp(), but inserterOpCtx is not the OpCtx that checked out the session the stepdown is waiting on (recall this is a 3-way deadlock). Only the OpCtx that checked out a session is marked killed by the stepdown thread. So if stepdown runs after this check, the 3-way deadlock can still happen. This is regardless of whether we backport |
| Comment by Wenbin Zhu [ 28/Jun/21 ] |
|
renctan I think that with CancelableOpCtx the deadlock cannot happen: before the stepdown thread waits for session checkout, it marks the OpCtx as killed, and a CancelableOpCtx is also killed/interrupted if it was created with a cancellation token from an OpCtx that is marked killed. So when the migration thread tries to acquire the RSTL while holding the session, it fails because its (cancelable) OpCtx is marked killed with error code `InterruptedDueToReplStateChange`, which breaks the deadlock. But drop collection is not interruptible, so the deadlock can occur again. |
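The mechanism described in this comment can be sketched in isolation. This is a minimal illustration only, not the server's code: `Operation`, `CancellationToken`, `LockResult`, and `acquireInterruptibly` are hypothetical names, and a `std::timed_mutex` stands in for the RSTL. It shows only the idea that a waiter holding a token derived from a killed operation gives up instead of blocking.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical token: shares the "killed" flag of the operation it came from.
class CancellationToken {
public:
    explicit CancellationToken(std::shared_ptr<std::atomic<bool>> flag)
        : _flag(std::move(flag)) {
        assert(_flag);
    }
    bool isCanceled() const { return _flag->load(); }

private:
    std::shared_ptr<std::atomic<bool>> _flag;
};

// Hypothetical stand-in for an operation that stepdown can mark as killed.
class Operation {
public:
    void markKilled() { _killed->store(true); }
    CancellationToken token() const { return CancellationToken(_killed); }

private:
    std::shared_ptr<std::atomic<bool>> _killed =
        std::make_shared<std::atomic<bool>>(false);
};

enum class LockResult { kAcquired, kInterrupted };

// Interruptible acquisition: keeps retrying only while the token is live.
// The caller must not already own `lock`.
LockResult acquireInterruptibly(std::timed_mutex& lock,
                                const CancellationToken& token) {
    while (!token.isCanceled()) {
        if (lock.try_lock_for(std::chrono::milliseconds(1))) {
            return LockResult::kAcquired;
        }
    }
    return LockResult::kInterrupted;
}
```

An operation that ignores the token (the analogue of the non-interruptible drop collection mentioned above) would wait on the lock unconditionally, which is why the deadlock can reappear on that path.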
| Comment by Randolph Tan [ 28/Jun/21 ] |
|
Hm... Wouldn't CancelableOpCtx also run into the same issue you mentioned, where the stepdown thread has to wait for checked-out sessions? Since the stepDown thread can't check the session, it won't be able to kill it. |
| Comment by Wenbin Zhu [ 28/Jun/21 ] |
|
Because of this issue, we started re-investigating BF-19260 which introduced
This deadlock was also discovered in
That fix made us rethink if
I ran some patch builds after reverting
renctan pierlauro.sciarelli any thoughts on this? |
| Comment by Wenbin Zhu [ 22/Jun/21 ] |
|
I think there is another problem with ScopedBlockSessionCheckouts in the case of concurrent stepdowns.
Now even though thread 2 is still inside this block, we again allow checking out sessions because step 3 resets the flag, which defeats the purpose of ScopedBlockSessionCheckouts. If this happens, the deadlock problem in
This problem seems to have a simple fix: use a counter instead of the boolean value for _checkoutAllowed. But we might need a cleaner solution that can handle both problems. |
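The counter idea from this comment can be illustrated with a small standalone sketch. It is not the fix that was committed for this ticket, and `CheckoutGate`, `ScopedBlock`, and `blockedAfterFirstExit` are hypothetical names rather than server classes. With a boolean, the first guard to leave scope re-enables checkouts; with a counter, checkouts stay blocked until every guard has left.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the state behind _checkoutAllowed,
// tracked as a count of active blockers instead of a single boolean.
class CheckoutGate {
public:
    void block() {
        std::lock_guard<std::mutex> lk(_mutex);
        ++_blockers;
    }
    void unblock() {
        std::lock_guard<std::mutex> lk(_mutex);
        assert(_blockers > 0);
        --_blockers;
    }
    // Checkouts are allowed only when no guard is in scope.
    bool checkoutAllowed() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _blockers == 0;
    }

private:
    mutable std::mutex _mutex;
    int _blockers = 0;
};

// RAII guard, analogous in spirit to ScopedBlockSessionCheckouts.
class ScopedBlock {
public:
    explicit ScopedBlock(CheckoutGate& gate) : _gate(gate) { _gate.block(); }
    ~ScopedBlock() { _gate.unblock(); }
    ScopedBlock(const ScopedBlock&) = delete;
    ScopedBlock& operator=(const ScopedBlock&) = delete;

private:
    CheckoutGate& _gate;
};

// Replays the interleaving from the comment in one thread: two stepdown
// "threads" enter, the first exits, and the second is still inside.
bool blockedAfterFirstExit() {
    CheckoutGate gate;
    auto thread1 = std::make_unique<ScopedBlock>(gate);
    auto thread2 = std::make_unique<ScopedBlock>(gate);
    thread1.reset();  // first guard leaves scope
    return !gate.checkoutAllowed();  // second guard is still in scope
}
```

As the comment notes, this addresses only the premature flag reset; it does not by itself handle a lagging stepdown thread that is still blocking checkouts while the node applies oplog entries as a secondary.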
| Comment by Steven Vannelli [ 16/Jun/21 ] |
|
samy.lanka I'm going through and requesting 5.0 Backports for all 5.0 Hot BFs in WFBF. Feel free to update / remove post-rc0 label as you see fit. |