[SERVER-84709] Resharding critical section timeout is not honored on stepdown Created: 09/Jan/24 Updated: 23/Jan/24 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Allison Easton | Assignee: | Adi Zaimi |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | cs-subteam3 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Assigned Teams: |
Cluster Scalability
|
| Operating System: | ALL |
| Steps To Reproduce: | The attached repro is not perfect, since it assumes that the stepdown happens before the timeout is hit, but it has reproduced the problem fairly consistently in my environment. |
| Participants: |
| Description |
|
The reshardingCriticalSectionTimeoutMillis parameter is intended to bound how long the critical section is held during resharding. It is implemented by scheduling a callback that sets an error if the timeout is exceeded. However, the callback is scheduled only locally on the node that entered the critical section, and it appears that it is never rescheduled after a stepdown, so the timeout is ignored once a stepdown occurs. |
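One possible direction, sketched here only to illustrate the problem and not taken from the server code: if the time at which the critical section was entered is recorded in state that survives a stepdown, a newly elected primary can compute the remaining budget and re-arm the timeout callback. The names `CriticalSectionState`, `started_at_ms`, and `remaining_timeout_ms` are hypothetical, and the in-memory dataclass merely stands in for whatever durable state the real implementation would use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CriticalSectionState:
    """Hypothetical stand-in for state that survives a stepdown."""
    started_at_ms: Optional[int] = None


def remaining_timeout_ms(state: CriticalSectionState, timeout_ms: int, now_ms: int) -> int:
    """Return how many milliseconds of the timeout budget are left.

    The first caller records the start time. A later caller (for example a
    new primary after a stepdown) reuses the recorded start time instead of
    restarting or dropping the timer.
    """
    if state.started_at_ms is None:
        state.started_at_ms = now_ms
    elapsed_ms = now_ms - state.started_at_ms
    return max(0, timeout_ms - elapsed_ms)


# Assumed timeline: timeout of 5000 ms, critical section entered at t=1000.
state = CriticalSectionState()
on_original_primary = remaining_timeout_ms(state, timeout_ms=5000, now_ms=1000)

# Stepdown occurs; the new primary re-arms the callback at t=4000.
on_new_primary = remaining_timeout_ms(state, timeout_ms=5000, now_ms=4000)

# If the new primary only steps up after the deadline, the budget is already spent.
after_deadline = remaining_timeout_ms(state, timeout_ms=5000, now_ms=7000)
```

In this sketch the new primary would schedule its callback with `on_new_primary` milliseconds remaining rather than scheduling nothing, and a result of 0 would mean the error should be set immediately.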