[SERVER-76680] Set a long election timeout for shard split passthroughs Created: 28/Apr/23 Updated: 02/May/23 Resolved: 02/May/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Didier Nadeau | Assignee: | Mathis Bessa |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: |
Serverless
|
||||
| Operating System: | ALL | ||||
| Sprint: | Server Serverless 2023-05-15 | ||||
| Participants: | |||||
| Linked BF Score: | 10 | ||||
| Description |
|
Spurious elections on the recipient set might cause shard split passthroughs to fail. If a failover occurs between stepping up the recipient primary and appending the oplog note (both tasks done by the shard split service), appending the oplog note won't succeed and the shard split will fail. We should explicitly set a high election timeout for shard split passthroughs to reduce the likelihood of that happening. |
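
As a rough illustration of the proposal, the sketch below raises `settings.electionTimeoutMillis` in a replica set configuration document so that a spurious election is unlikely during the window between the step-up and the oplog note write. The helper `withLongElectionTimeout`, the 24-hour value, the set name, and the host names are all assumptions made for this example. They are not taken from the passthrough suites or from the shard split service.

```javascript
// Hypothetical helper (not part of the MongoDB test libraries): returns a copy
// of a replica set config with a long election timeout applied.
// The 24-hour default is an assumed value chosen only for illustration.
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

function withLongElectionTimeout(replSetConfig, timeoutMs = ONE_DAY_MS) {
    return {
        ...replSetConfig,
        // Preserve any existing settings and override only the election timeout.
        settings: {...(replSetConfig.settings || {}), electionTimeoutMillis: timeoutMs},
    };
}

// Example recipient set config; the set name and hosts are placeholders.
const baseConfig = {
    _id: "recipientSet",
    version: 1,
    members: [
        {_id: 0, host: "localhost:20020"},
        {_id: 1, host: "localhost:20021"},
        {_id: 2, host: "localhost:20022"},
    ],
    settings: {heartbeatIntervalMillis: 2000},
};

const recipientConfig = withLongElectionTimeout(baseConfig);
```

The helper returns a new object rather than mutating its input, so the same base config can be reused for sets that should keep the default timeout.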
| Comments |
| Comment by Mathis Bessa [ 02/May/23 ] |
|
Closing this since shard split passthroughs are already defaulting |
| Comment by Didier Nadeau [ 01/May/23 ] |
|
Additional context: this scenario (an election between replSetStepUp and writing the oplog note) occurred in a JS test in BF-28002. We expect the same scenario can also happen in passthroughs, so we want to increase the election timeout in split passthroughs. |