[SERVER-82883] Recovering TransactionCoordinator on stepup may block acquiring read/write tickets while participants are in the prepared state Created: 07/Nov/23 Updated: 01/Jan/24 Resolved: 12/Dec/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 4.4.0, 5.0.0, 6.0.0, 7.0.0 |
| Fix Version/s: | 7.3.0-rc0, 7.0.5, 6.0.13, 5.0.24, 4.4.28 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Jordi Serra Torrens | Assignee: | Wenqin Ye |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Cluster Scalability |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v7.0, v6.0, v5.0, v4.4 |
| Sprint: | Cluster Scalability 2023-11-27, Cluster Scalability 2023-12-11, Cluster Scalability 2023-12-25 |
| Participants: | |
| Linked BF Score: | 155 |
| Description |
|
Consider a TransactionCoordinator that has sent the prepare command to the participants and then crashes. The new primary, on stepup, will resume the coordination. There are several points at which this can stall behind a read/write ticket acquisition. This is undesirable, both for performance and because it can cause deadlocks. Ticket acquisitions occur at:
In addition to not skipping ticket acquisition, (1) and (3) do not skip FlowControl either. |
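The deadlock risk described above can be illustrated with a toy model. This is a minimal sketch and not MongoDB code: the `simulate` function, the ticket count, and the timeout are all hypothetical. It also assumes one particular way the stall could arise, namely that every ticket is held by an operation waiting for the prepared transaction's outcome, which only the recovering coordinator can deliver.

```python
import threading


def simulate(exempt_coordinator, tickets=2, timeout=0.2):
    """Toy model: return True if the coordinator resolves the transaction.

    All names here are illustrative; nothing maps to a real server API.
    """
    pool = threading.Semaphore(tickets)       # stand-in for the ticket pool
    outcome_known = threading.Event()         # set once commit/abort is decided
    all_tickets_held = threading.Barrier(tickets + 1)

    def conflicting_operation():
        # Assumption: the operation keeps its ticket while it waits for the
        # prepared transaction to be committed or aborted.
        pool.acquire()
        all_tickets_held.wait()
        outcome_known.wait()
        pool.release()

    workers = [threading.Thread(target=conflicting_operation)
               for _ in range(tickets)]
    for worker in workers:
        worker.start()
    all_tickets_held.wait()  # from here on, every ticket is taken

    if exempt_coordinator:
        # The coordinator skips ticket acquisition and can make progress.
        resolved = True
    else:
        # The coordinator queues for a ticket that is never released.
        resolved = pool.acquire(timeout=timeout)
        if resolved:
            pool.release()

    # Unblock the workers in either case so the simulation terminates.
    outcome_known.set()
    for worker in workers:
        worker.join()
    return resolved
```

In this model, `simulate(exempt_coordinator=False)` times out waiting for a ticket, while `simulate(exempt_coordinator=True)` completes. The timeout stands in for what would otherwise be an indefinite wait.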
| Comments |
| Comment by Githook User [ 27/Dec/23 ] |
|
Author: {'name': 'Wenqin Ye', 'email': 'wenqin908@gmail.com', 'username': 'wenqinYe'} Message: GitOrigin-RevId: 24a5efab312e5f8c8253e07e05910fd0ed93e1a2 |
| Comment by Githook User [ 26/Dec/23 ] |
|
Author: {'name': 'Wenqin Ye', 'email': 'wenqin908@gmail.com', 'username': 'wenqinYe'} Message: GitOrigin-RevId: e75ba7014caabc5c0a2296fa5105387ecd6c51c6 |
| Comment by Githook User [ 15/Dec/23 ] |
|
Author: {'name': 'Wenqin Ye', 'email': 'wenqin908@gmail.com', 'username': 'wenqinYe'} Message: GitOrigin-RevId: b17eab4d2a58b8486cdb357398c33c0108f404cc |
| Comment by Githook User [ 15/Dec/23 ] |
|
Author: {'name': 'Wenqin Ye', 'email': 'wenqin908@gmail.com', 'username': 'wenqinYe'} Message: GitOrigin-RevId: 125838b48ea42921da92e99a550faefc895a1755 |
| Comment by Githook User [ 11/Dec/23 ] |
|
Author: {'name': 'Wenqin Ye', 'email': 'wenqin908@gmail.com', 'username': 'wenqinYe'} Message: GitOrigin-RevId: fe92af8bb9da945673722a7a4115401cfa95a6ff |
| Comment by Josef Ahmad [ 08/Nov/23 ] |
|
I know little about the transaction coordinator, but assuming that the majority of the work it does is on the critical path for retiring prepared transactions, would it make sense to exempt the coordinator from acquiring tickets altogether? It could be a more sustainable approach than patching up individual storage accesses. |