[SERVER-36503] Skip dry-run election during election handoff Created: 07/Aug/18 Updated: 29/Oct/23 Resolved: 12/Sep/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 3.6.9, 4.0.3, 4.1.4 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | Vesselina Ratcheva (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Backport Requested: | v4.0, v3.6 | ||||
| Sprint: | Repl 2018-08-27, Repl 2018-09-10, Repl 2018-09-24 | ||||
| Participants: | |||||
| Description |
|
With the new election handoff machinery, when a primary receives replSetStepDown, the last thing it does as part of stepDown is send a replSetStepUp command to the node it believes is best equipped to take over as the new primary. The stepUp command makes that node call for an election through the normal election path, which includes a dry-run election. Dry-run elections aren't required for correctness; they are an optimization to limit unnecessary term changes caused by failed elections. But since we already believe this node has a good chance of winning, and since the primary is stepping down and a term change will happen anyway, we could skip the dry run on the candidate node and still have a good chance of winning. This would reduce failover time in the common case of planned failover, at the risk of inducing extra unnecessary elections in the degenerate case. |
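The trade-off described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the server's election code: `Node`, `run_election`, and the `skip_dry_run` argument are hypothetical names, and vote collection is reduced to a count of nodes willing to vote for the candidate (including the candidate itself).

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy replica set member; only the election term is modelled."""
    name: str
    term: int = 1


def run_election(candidate, willing_votes, total_voters, skip_dry_run=False):
    """Return (won, vote_rounds) for a simplified election.

    willing_votes: voters (candidate included) that would grant their vote.
    A dry run asks for votes without bumping the term; the real election
    increments the candidate's term before asking.
    """
    majority = total_voters // 2 + 1
    vote_rounds = 0

    if not skip_dry_run:
        vote_rounds += 1
        if willing_votes < majority:
            # Dry run failed: give up without changing the term.
            return False, vote_rounds

    # Real election: the term changes whether or not the candidate wins.
    candidate.term += 1
    vote_rounds += 1
    return willing_votes >= majority, vote_rounds


# Common case: the handoff target is caught up and wins either way,
# but skipping the dry run saves one round of vote requests.
print(run_election(Node("b"), willing_votes=2, total_voters=3))
print(run_election(Node("b"), willing_votes=2, total_voters=3, skip_dry_run=True))
```

In the degenerate case (`willing_votes=1` in a three-node set), the model with `skip_dry_run=True` loses the election and still increments the term, which is the extra unnecessary election the description warns about.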
| Comments |
| Comment by Githook User [ 19/Sep/18 ] |
|
Author: {'name': 'Vesselina Ratcheva', 'email': 'vesselina.ratcheva@mongodb.com'} Message: (cherry picked from commit b19e39088cf8754186de8f5f3f1dae17a12aaa4c) |
| Comment by Githook User [ 12/Sep/18 ] |
|
Author: {'name': 'Vesselina Ratcheva', 'email': 'vesselina.ratcheva@mongodb.com'} Message: (cherry picked from commit b19e39088cf8754186de8f5f3f1dae17a12aaa4c) |
| Comment by Githook User [ 12/Sep/18 ] |
|
Author: {'name': 'Vesselina Ratcheva', 'email': 'vesselina.ratcheva@mongodb.com'}Message: |
| Comment by Andy Schwerin [ 17/Aug/18 ] |
|
That sounds good to me, then. |
| Comment by Siyuan Zhou [ 16/Aug/18 ] |
|
Yes, an argument for "skip the dry run". Backward compatibility is a good point. It looks like we don't currently parse the command, so I expect it'll be safe on old versions. |
| Comment by Andy Schwerin [ 16/Aug/18 ] |
|
Is there an argument we could add that would be ignored on old versions of mongod, rather than erroring? That would make stepUp work more seamlessly in mixed-version systems. I assume you want an argument that means "skip the dry run", not an argument that means "do not skip the dry run", so that users in the shell get the safer behavior with the shorter command invocation? siyuan.zhou |
| Comment by Siyuan Zhou [ 16/Aug/18 ] |
|
schwerin proposed skipping the dry run on every stepUp command. I'm concerned that this change would make the stepUp command very hard to use manually from a shell: unless the user periodically checks replication progress by hand, they risk interrupting the primary with a failed election. The command is not documented, but it might be useful to Support and to us for manual testing. I'd propose adding a new parameter to the command instead. CC spencer. |
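The opt-in parameter discussed in this thread can be sketched as follows. This is a hedged illustration, not the actual command handler: the field name `skipDryRun`, the function `handle_step_up`, and the `understands_skip_flag` switch (standing in for "old" versus "new" mongod) are all assumptions made for the example. It shows why a flag meaning "skip the dry run" degrades safely: a node that ignores the unknown field, and a shell user who omits it, both get the dry run.

```python
def handle_step_up(cmd, understands_skip_flag):
    """Return the election phases a node would run for a stepUp request.

    cmd: the command document, e.g. {"replSetStepUp": 1}.
    understands_skip_flag: False models an older node that does not parse
    extra fields and therefore ignores the (hypothetical) skipDryRun flag.
    """
    if "replSetStepUp" not in cmd:
        raise ValueError("not a replSetStepUp command")

    skip = False
    if understands_skip_flag:
        # Absent flag defaults to the safer behavior: run the dry run.
        skip = bool(cmd.get("skipDryRun", False))

    return ["real-election"] if skip else ["dry-run", "real-election"]


# Election handoff from a stepping-down primary to a newer node.
print(handle_step_up({"replSetStepUp": 1, "skipDryRun": True}, True))
# Same request sent to an older node: the flag is ignored, not an error.
print(handle_step_up({"replSetStepUp": 1, "skipDryRun": True}, False))
# Manual invocation from a shell without the flag.
print(handle_step_up({"replSetStepUp": 1}, True))
```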