[SERVER-32661] Extended PV1 arbiter primeality test to all secondaries Created: 11/Jan/18  Updated: 27/Oct/23  Resolved: 19/Jan/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Kevin Arhelger Assignee: Spencer Brody (Inactive)
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:
Case:

 Description   

Under PV1, arbiters check whether they can see a primary (src/mongo/db/repl/topology_coordinator_impl.cpp line 2580), but secondaries would still vote in this scenario.

Under certain rare conditions, this can cause the primary role to flap between nodes:

1. Netsplit
2. Few writes combined with replication lag keep secondaries from moving their optime ahead.
3. A secondary runs for election
4. Enough secondaries vote for the new primary
5. Election occurs
6. Old primary runs a priority takeover

We request this test be extended to all hosts.



 Comments   
Comment by Spencer Brody (Inactive) [ 19/Jan/18 ]

This behavior is as-designed.

A node voting no in an election for a candidate that is otherwise a totally viable candidate to be primary, solely because it believes (based on its fundamentally outdated view of the world) another primary to be up, risks delaying elections when there is an actual failure, resulting in longer periods of write unavailability for no reason.

Even worse, it can actually put the system into a state of indefinite write unavailability for the duration of a network partition. Imagine you have 3 nodes: A, B, and C. Node A is currently primary. Then there's a netsplit where node A gets isolated from both the client and node C, but can still see node B. Currently in that scenario, node C will run for election and win, restoring write availability. If node B refused to vote for node C just because it can see node A, you'd be stuck without any primary that the client can reach to send writes to.
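The three-node scenario above can be sketched as a toy vote count. This is only an illustration, not the server's election logic: the function names, the link representation, and the `veto_if_primary_visible` flag are all hypothetical, and the model ignores optimes, terms, and heartbeat timing.

```python
def can_see(links, a, b):
    """True if nodes a and b have a working network link (hypothetical model)."""
    return frozenset((a, b)) in links


def run_election(candidate, nodes, links, primary, veto_if_primary_visible):
    """Return True if the candidate collects a strict majority of votes.

    veto_if_primary_visible=False models the current secondary behavior
    described in this ticket; True models the requested change, where a
    voter refuses if it can still see the existing primary.
    """
    votes = 1  # the candidate votes for itself
    for voter in nodes:
        if voter == candidate or not can_see(links, candidate, voter):
            continue  # unreachable voters cannot respond
        sees_primary = voter == primary or can_see(links, voter, primary)
        if veto_if_primary_visible and sees_primary:
            continue  # voter says no because a primary still looks alive
        votes += 1
    return votes > len(nodes) // 2


# Netsplit from the comment: A-C is cut, A-B and B-C still work.
nodes = ["A", "B", "C"]
links = {frozenset(("A", "B")), frozenset(("B", "C"))}

current = run_election("C", nodes, links, primary="A", veto_if_primary_visible=False)
proposed = run_election("C", nodes, links, primary="A", veto_if_primary_visible=True)
print("current behavior, C wins:", current)
print("requested behavior, C wins:", proposed)
```

In this model C wins under the current behavior because B's vote gives it 2 of 3, while under the requested behavior B's refusal leaves C with only its own vote, so the client, which cannot reach A, has no primary to write to.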

The situation you describe only results in an unnecessary election when priorities are in use. By electing to use priorities, you are indicating to the system that having one specific node as primary is more important to you than avoiding elections. In the case you describe, if you are okay with the primary that was elected in step 4 remaining primary, then it should be configured with the same priority as the original primary.
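As a rough sketch of that last suggestion, the snippet below edits a plain dictionary shaped like a replica set configuration document so that one member gets the same `priority` as another. The hostnames, the set name, and the `match_priority` helper are hypothetical, and applying the resulting document to a real deployment (for example through a reconfig) is deliberately left out.

```python
import copy


def match_priority(config, reference_host, target_host):
    """Return a copy of config where target_host has reference_host's priority.

    Hypothetical helper: it only edits the dictionary and does not talk
    to a server.
    """
    new_config = copy.deepcopy(config)
    by_host = {member["host"]: member for member in new_config["members"]}
    # A member without an explicit priority is treated as priority 1.
    by_host[target_host]["priority"] = by_host[reference_host].get("priority", 1)
    return new_config


# Illustrative configuration; all names here are assumptions.
config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "a.example.net:27017", "priority": 2},
        {"_id": 1, "host": "b.example.net:27017", "priority": 1},
        {"_id": 2, "host": "c.example.net:27017", "priority": 1},
    ],
}

updated = match_priority(config, "a.example.net:27017", "c.example.net:27017")
for member in updated["members"]:
    print(member["host"], member["priority"])
```

With equal priorities, neither of the two nodes has a reason to run a priority takeover against the other, which is what removes the extra election in the reported scenario.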

Generated at Thu Feb 08 04:30:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.