Type: Bug
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- repl-shortlist

Assigned Teams: Replication
Operating System: ALL
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Suppose we have a replica set where the primary node is degraded. The customer may decide to send a replSetStepDown command to the primary, which then sends a replSetStepUp (election handoff) to a suitable node to minimize the time without a primary. Without the handoff, the other nodes would have to detect the missing primary through heartbeat timeouts and then run an election.

However, in some cases the primary is so degraded that it responds to the replSetStepDown late, and by the time it sends a replSetStepUp, the other nodes have already run an election and elected a new primary.

Because the old primary sends replSetStepUp with skipDryRun: true, the recipient starts a new election unconditionally, which disrupts the replica set even though the other nodes have already elected a new primary.

This ticket is to help in that edge case. Possible solutions are attaching term information to the replSetStepUp as an indicator of how stale the request is, or omitting skipDryRun: true (which has the downside of a longer time before a primary is elected).
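The term-based option can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's implementation: the Node class, the handle_step_up method, and the request_term field are all hypothetical names introduced here. The model assumes that starting a real election (no dry run) increments the node's term, and that a handoff request carrying a term older than the recipient's current term can be treated as stale.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Toy replica set member (hypothetical; not MongoDB's real classes)."""
    name: str
    term: int = 1
    elections_started: int = 0

    def start_election(self) -> None:
        # A real election (dry run skipped) bumps the term immediately.
        self.term += 1
        self.elections_started += 1

    def handle_step_up(self, request_term: Optional[int] = None) -> bool:
        """Handle an election handoff request.

        request_term is a HYPOTHETICAL field: the sender's term at the time
        it decided to step down. None models today's behavior, where the
        request carries no staleness information.
        """
        if request_term is not None and request_term < self.term:
            # The set has already moved to a newer term, so the handoff
            # is stale and is ignored.
            return False
        self.start_election()
        return True


# The old primary stepped down in term 1, but its handoff arrives after the
# other nodes have already elected a new primary in term 2.
current = Node("node-b", term=2)
current.handle_step_up()                 # unconditional: a new election starts

proposed = Node("node-c", term=2)
proposed.handle_step_up(request_term=1)  # stale request: ignored
```

In this model the unconditional path moves node-b to a new term and disturbs the freshly elected primary, while the term-checked path leaves node-c untouched. A handoff whose term matches the recipient's current term still starts an election right away, so the fast path is kept for the non-degraded case.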

See linked issue.

Assignee: Unassigned
Reporter: Vishnu Kaushik
Participants: Vishnu Kaushik
Votes: 0
Watchers: 6

Created: May 04 2026 03:05:39 PM UTC
Updated: May 04 2026 05:54:17 PM UTC
