[SERVER-45116]  replSetStepDown attempts to hand off election to highest priority node Created: 12/Dec/19  Updated: 06/Dec/22  Resolved: 10/Feb/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Tess Avitabile (Inactive) Assignee: Backlog - Replication Team
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-43904 When stepping down, step up doesn't f... Closed
Assigned Teams:
Replication
Participants:
Case:

 Description   

There are certain scenarios where priority takeover never succeeds. For example, suppose a user has 3 nodes in the main datacenter and 2 nodes in the backup datacenter. They want to move the primary from the main datacenter to the backup datacenter, so they raise the priority of a node in the backup datacenter. However, the nodes in the backup datacenter are always behind the main datacenter, so they can never win an election.

A proposed fix to this is to let replSetStepDown accept a suggested next primary. When the command waits for a majority of nodes to catch up, it also waits for the suggested next primary to catch up. It then sends the replSetStepUp command to the suggested next primary.
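
The proposed check can be sketched as a small model. This is not server code: optimes are simplified to integers, the function name try_step_down and the StepDownResult type are hypothetical, and counting the primary toward the majority is an assumption of the sketch. It evaluates the condition at one instant and reports which node would receive replSetStepUp.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StepDownResult:
    stepped_down: bool
    step_up_target: Optional[str]  # node that would receive replSetStepUp


def try_step_down(primary_optime: int,
                  secondary_optimes: Dict[str, int],
                  suggested: str) -> StepDownResult:
    """Hypothetical model of the proposed check, evaluated at one instant.

    Optimes are simplified to non-negative integers. Assumption: the primary
    itself counts toward the majority of caught-up nodes.
    """
    total_nodes = len(secondary_optimes) + 1
    majority = total_nodes // 2 + 1
    caught_up = 1 + sum(1 for t in secondary_optimes.values()
                        if t >= primary_optime)

    majority_ok = caught_up >= majority
    # An unknown suggested node is treated as never caught up.
    suggested_ok = secondary_optimes.get(suggested, -1) >= primary_optime

    if majority_ok and suggested_ok:
        return StepDownResult(True, suggested)
    return StepDownResult(False, None)


# Scenario from the description: two main-datacenter secondaries are caught
# up, the two backup-datacenter nodes lag behind the primary.
lagging = {"main2": 100, "main3": 100, "backup1": 90, "backup2": 90}
print(try_step_down(100, lagging, "backup1"))
```

In this model a majority is already caught up, yet the step down keeps waiting until backup1 reaches the primary's optime. That is the extra wait the proposal accepts in exchange for handing the election to the suggested node.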



 Comments   
Comment by Evin Roesle [ 10/Feb/20 ]

Closing this project as Won't Do. If you would like to elect a specific node, use the replSetFreeze command to freeze the nodes that you do not want elected, then run the replSetStepDown command. If this is not a suitable solution, please reopen this ticket.
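
As a rough illustration of that workaround, the sketch below only builds the command documents and the order in which they would be sent. The function name plan_freeze_and_step_down, the host names, and the default durations are hypothetical. The command shapes ({replSetFreeze: <seconds>} and {replSetStepDown: <seconds>}) follow the documented commands.

```python
from typing import Dict, List, Tuple

Command = Dict[str, object]


def plan_freeze_and_step_down(primary: str,
                              secondaries: List[str],
                              preferred: str,
                              freeze_secs: int = 120,
                              step_down_secs: int = 60
                              ) -> List[Tuple[str, Command]]:
    """Return (host, command) pairs in the order they would be run.

    Every secondary except the preferred one is frozen first, then the
    primary is asked to step down. Durations are illustrative assumptions.
    """
    if preferred not in secondaries:
        raise ValueError("preferred node must be one of the secondaries")

    plan: List[Tuple[str, Command]] = []
    for host in secondaries:
        if host != preferred:
            plan.append((host, {"replSetFreeze": freeze_secs}))
    plan.append((primary, {"replSetStepDown": step_down_secs}))
    return plan


for host, command in plan_freeze_and_step_down(
        "main1:27017",
        ["main2:27017", "main3:27017", "backup1:27017", "backup2:27017"],
        preferred="backup1:27017"):
    print(host, command)
```

Each command would be run against the admin database of the listed host, for example with db.adminCommand({replSetFreeze: 120}) in the shell. The freeze duration should probably be long enough to cover the step down and the following election.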

Comment by Siyuan Zhou [ 30/Jan/20 ]

To follow up on the first issue tess.avitabile pointed out: SERVER-43904 is tracking the replSetFreeze bug.

Comment by Tess Avitabile (Inactive) [ 30/Jan/20 ]

Summary of discussion with evin.roesle, siyuan.zhou, and judah.schvimer:

  • There is a workaround for this problem today, which is to run the replSetFreeze command as described in the documentation. A downside to using the replSetFreeze command is that election handoff does not take into account whether a node is frozen, which leads to unnecessary downtime, since the replSetStepUp command for election handoff will fail if run against a frozen node. This is something we could just fix.
  • In cases where a priority takeover would happen, doing this ticket will not result in additional downtime. For priority takeover, there is downtime during primary catchup. This ticket would move the waiting to the replSetStepDown command, but the amount of waiting would be the same.
  • In cases where a priority takeover cannot happen because the highest-priority node is behind a majority, doing this ticket would result in additional downtime. This is a tradeoff between availability and converging on a user's priorities. However, the user still has the ability to control the amount of downtime by configuring the number of seconds specified in the replSetStepDown command.
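
The frozen-node problem from the first bullet can be illustrated with a small model of handoff candidate selection. This is a sketch under assumptions, not the server's algorithm: the Member type and choose_handoff_candidate are hypothetical, optimes are plain integers, and picking the highest-priority eligible node is chosen only for illustration. The point is the `not m.frozen` filter, which avoids sending replSetStepUp to a node that would reject it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Member:
    host: str
    optime: int        # simplified to an integer
    priority: float
    frozen: bool = False


def choose_handoff_candidate(primary_optime: int,
                             members: List[Member]) -> Optional[str]:
    """Pick a node to receive replSetStepUp, skipping frozen nodes.

    Assumption: a candidate must be electable (priority > 0), caught up
    to the primary, and not frozen.
    """
    eligible = [m for m in members
                if m.priority > 0
                and not m.frozen
                and m.optime >= primary_optime]
    if not eligible:
        return None
    # On a priority tie, max() keeps the first eligible member in list order.
    return max(eligible, key=lambda m: m.priority).host


members = [
    Member("main2", 100, 1.0, frozen=True),  # caught up but frozen
    Member("main3", 100, 1.0),               # caught up and eligible
    Member("backup1", 90, 2.0),              # highest priority but lagging
]
print(choose_handoff_candidate(100, members))
```

Without the frozen check, a handoff to main2 would fail and the set would fall back to a normal election timeout, which is the unnecessary downtime described above.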

Comment by Tess Avitabile (Inactive) [ 02/Jan/20 ]

A Replication team member suggested that instead of letting the user suggest the next primary, the stepping down node should use the highest-priority node as the suggested next primary.
