[SERVER-77633] Calling withTransaction with a checked out session may end up in a deadlock (on stepdown) Created: 31/May/23 Updated: 22/Jun/23 Resolved: 22/Jun/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Silvia Surroca | Assignee: | Silvia Surroca |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Assigned Teams: | Sharding EMEA |
| Sprint: | Sharding EMEA 2023-06-12, Sharding EMEA 2023-06-26 |
| Participants: | |
| Linked BF Score: | 135 |
| Description |
|
Any code that calls `withTransaction` may deadlock if the given OperationContext holds a SessionId and a stepdown occurs while the transaction is in progress. No thread currently holds a session when `withTransaction` is called; however, this should be fixed to avoid hitting the error in the future. The sequence of events leading to a deadlock is the following:
1. (
`withTransaction` is a utility method that was implemented for the ShardingCatalogManager before the new transaction API existed. The new transaction API yields the session attached to the thread to avoid this scenario, so I suggest removing the `withTransaction` code and using the new transaction API instead. This is an example of an implementation using the new transaction API.
This issue was discovered when the sessionId was attached to the ConfigsvrCollMod request. The sessionId was ultimately removed as a quick fix for the bug. |
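For illustration only, the "yield the session before running the transaction" idea can be sketched in standalone C++. This is not the server's code or the real transaction API: `Session`, `ScopedSessionYield`, and `runWithYieldedSession` are hypothetical names invented for this sketch, and the session is modelled as a simple checked-out flag rather than the server's session catalog.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for a session that only one operation may check out at a time.
class Session {
public:
    void checkOut() {
        if (checkedOut_) throw std::logic_error("session already checked out");
        checkedOut_ = true;
    }
    void checkIn() {
        if (!checkedOut_) throw std::logic_error("session not checked out");
        checkedOut_ = false;
    }
    bool isCheckedOut() const { return checkedOut_; }

private:
    bool checkedOut_ = false;
};

// RAII guard (hypothetical): checks the session in on construction and checks it
// out again on destruction, so the guarded scope runs without holding the session.
// Assumes nothing else checks the session out while it is yielded.
class ScopedSessionYield {
public:
    explicit ScopedSessionYield(Session& session)
        : session_(session), wasCheckedOut_(session.isCheckedOut()) {
        if (wasCheckedOut_) session_.checkIn();
    }
    ~ScopedSessionYield() {
        if (wasCheckedOut_) session_.checkOut();
    }
    ScopedSessionYield(const ScopedSessionYield&) = delete;
    ScopedSessionYield& operator=(const ScopedSessionYield&) = delete;

private:
    Session& session_;
    bool wasCheckedOut_;
};

// Runs the transaction body with the caller's session yielded, then restores it.
void runWithYieldedSession(Session& session, const std::function<void()>& txnBody) {
    ScopedSessionYield yield(session);
    txnBody();
}
```

A caller that already holds the session would invoke `runWithYieldedSession(session, body)`; inside `body` the session is no longer checked out, so anything that needs to check it out in the meantime is not blocked by the caller.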
| Comments |
| Comment by Silvia Surroca [ 22/Jun/23 ] |
|
We've decided not to address this ticket since no operation currently uses the `withTransaction` utility while holding a session. On one hand, the caller of `withTransaction` should yield any resources it holds. On the other hand, we are migrating the code toward the new transaction API, where this problem is no longer present. |