- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Replication
- ALL
Suppose a replica set has a degraded primary node. The customer may decide to send a replSetStepDown command to the primary, which then sends a replSetStepUp (election handoff) to a suitable node to minimize the time without a primary. Without the handoff, the other nodes would have to detect through heartbeat timeouts that there is no primary and then run an election.
In some cases, however, the primary is so degraded that it responds to the replSetStepDown late, and by the time it sends a replSetStepUp, the other nodes have already run an election and elected a new primary.
Because the old primary sends replSetStepUp with skipDryRun: true, the recipient starts another election unconditionally, even though the other nodes have already decided on a new primary, and the replica set is disrupted a second time.
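The sequence above can be illustrated with a toy model. This is only a sketch, not the server's election logic: the `ReplicaSet` and `Node` classes and the node names A, B, and C are hypothetical, and an election is reduced to "bump the term and replace whoever is primary".

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    is_primary: bool = False


class ReplicaSet:
    """Hypothetical toy model: one shared term, at most one primary."""

    def __init__(self, names, primary):
        self.term = 1
        self.nodes = {n: Node(n, n == primary) for n in names}
        self.elections = []  # history of (term, winner)

    def primary(self):
        return next((n.name for n in self.nodes.values() if n.is_primary), None)

    def elect(self, winner):
        # A real (non-dry-run) election: the term is incremented and any
        # sitting primary is unseated, whether or not it was healthy.
        self.term += 1
        for node in self.nodes.values():
            node.is_primary = node.name == winner
        self.elections.append((self.term, winner))


rs = ReplicaSet(["A", "B", "C"], primary="A")

# A is degraded and slow to act on replSetStepDown; B and C time out on
# heartbeats and elect B on their own.
rs.elect("B")

# A finally steps down and sends replSetStepUp (skipDryRun: true) to C.
# C starts a real election immediately, unseating the healthy primary B.
rs.elect("C")

print(rs.elections)
```

The second call to `elect` is the disruption described in this ticket: the set already had a healthy primary, yet the term advances again and the primary changes.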
This ticket is to address that edge case. Possible solutions are attaching term information to the replSetStepUp as an indicator of how stale the request is, or omitting skipDryRun: true (which would have the downside of a longer time before a primary is elected).
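A minimal sketch of the first proposal, assuming the sender's term is attached to the request. The `sender_term` field, the `StepUpRequest` type, and the `should_honor_step_up` helper are hypothetical; replSetStepUp has no such field according to this ticket, and the real staleness rule might differ.

```python
from dataclasses import dataclass


@dataclass
class StepUpRequest:
    skip_dry_run: bool
    sender_term: int  # hypothetical: term the old primary was in when it stepped down


def should_honor_step_up(request: StepUpRequest, local_term: int) -> bool:
    """Return False when the request is stale.

    If the recipient's term is already higher than the sender's, an
    election has happened since the sender stepped down, so starting
    another one would only disrupt the set.
    """
    return request.sender_term >= local_term


# Handoff arrives promptly: recipient is still in the sender's term.
fresh = should_honor_step_up(StepUpRequest(skip_dry_run=True, sender_term=1), local_term=1)

# Handoff arrives late: the other nodes already elected a primary in term 2.
stale = should_honor_step_up(StepUpRequest(skip_dry_run=True, sender_term=1), local_term=2)
```

With this check, the fast handoff path keeps skipDryRun: true for the common case, and only a late request is dropped.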
See linked issue.