[SERVER-26226] Disallow replica set configurations with 3 nodes, one arbiter, priorities greater than 1, and protocol version 1 Created: 21/Sep/16  Updated: 06/Dec/22  Resolved: 31/Oct/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.2.9
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Eric Milkie Assignee: Backlog - Replication Team
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Replication
Participants:

 Description   

Typically, users running a 3-node replica set with one arbiter will use w:1 writes instead of w:majority writes, since w:majority writes can no longer be acknowledged after one data-bearing node fails. Any writes made on a degraded cluster with only one data-bearing node are at risk of being lost.
In particular, rollback can occur once the cluster is healed by returning the second data-bearing node to it. At that point, under pv1, a priority takeover can roll back all writes made during the degraded state. Under protocol version 0, by contrast, a priority takeover can roll back at most 10 seconds of writes.

Because of the potential size of the rollback and how unexpected the behavior is, we propose prohibiting the configuration of 3 nodes, 1 arbiter, priorities greater than 1, and protocol version 1. Upon parsing such a configuration, we should degrade to protocol version 0 and add a warning to the system log.
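
For illustration only, the proposed check might look roughly like the Python sketch below. It is not the server's implementation. The config is modeled as a plain dict using the replica set config field names (protocolVersion, members, priority, arbiterOnly); the function names, the logger name, the set name, and the hostnames are all hypothetical. A config with no explicit protocolVersion is deliberately not flagged, which is an assumption of this sketch.

```python
import copy
import logging

# Hypothetical logger name, standing in for the system log.
log = logging.getLogger("replset_config_check")


def is_risky_config(config):
    """Return True for the shape described in this ticket: 3 members,
    exactly one arbiter, some priority greater than 1, protocol version 1."""
    members = config.get("members", [])
    arbiters = [m for m in members if m.get("arbiterOnly", False)]
    # A member with no explicit priority is treated as priority 1.
    has_high_priority = any(m.get("priority", 1) > 1 for m in members)
    return (
        len(members) == 3
        and len(arbiters) == 1
        and has_high_priority
        # Assumption: only an explicit protocolVersion of 1 is flagged.
        and config.get("protocolVersion") == 1
    )


def apply_proposed_policy(config):
    """Hypothetical helper: degrade a risky config to protocol version 0
    and log a warning. The caller's config is left unmodified."""
    if not is_risky_config(config):
        return config
    degraded = copy.deepcopy(config)
    degraded["protocolVersion"] = 0
    log.warning(
        "replica set %r: 3 nodes, 1 arbiter, priority > 1 and "
        "protocolVersion 1; degrading to protocolVersion 0",
        config.get("_id"),
    )
    return degraded


# Example config with made-up set name and hostnames.
example = {
    "_id": "rs0",
    "protocolVersion": 1,
    "members": [
        {"_id": 0, "host": "a.example.net:27017", "priority": 2},
        {"_id": 1, "host": "b.example.net:27017", "priority": 1},
        {"_id": 2, "host": "c.example.net:27017", "arbiterOnly": True},
    ],
}

result = apply_proposed_policy(example)
```

Here result carries protocolVersion 0 while example is unchanged. Returning a copy rather than mutating the input keeps the original document available if a stricter variant should reject the config outright instead of degrading it.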



 Comments   
Comment by Spencer Brody (Inactive) [ 31/Oct/16 ]

SERVER-26748 eliminates the need for this

Comment by Eric Milkie [ 29/Sep/16 ]

To expedite work on this, we may first simply implement warnings for this situation, and then follow up with prohibiting reconfigs that introduce an illegal configuration.

Comment by Andy Schwerin [ 21/Sep/16 ]

If we do this, I'd also like explicit reconfigs and initiates to fail if they involve an illegal configuration.

Comment by Andy Schwerin [ 21/Sep/16 ]

Should we just prohibit arbiters in PV1 entirely?
