Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 2.4.9
Component/s: Replication
Labels: ElecENH, majority
Assigned Teams: Replication
Operating System: ALL
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

It seems like the change in replica set majority calculation introduced in SERVER-5351 broke balancing on some existing cluster setups, since it bases the strict majority on the total number of members, not the number of non-arbiter ones.

We recently upgraded a cluster from v2.2.4 to v2.4.9, and lost our ability to balance the cluster in its original setup.

The cluster has 20 shards, and each shard is a replica set with four members: a primary, a secondary and an arbiter in one datacenter, and a non-voting, zero-priority, hidden secondary with a 12-hour replication delay in another datacenter.

After the upgrade, balancing the cluster failed because it was waiting for operations to replicate to a majority of all replica set members (3 out of 4), rather than a majority of the non-arbiter members (2 out of 3). With the third non-arbiter member on a 12-hour delay, that didn't go very well. I expect the same would happen on individual shards if either storage member became unavailable.
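To make the arithmetic concrete, here is a minimal sketch of the two ways of counting for one shard as described above. It is only an illustration, not the server's actual code: the member dictionaries, field names, and helper functions are hypothetical.

```python
# Hypothetical model of one shard's replica set, as described in this report.
members = [
    {"name": "primary",   "arbiter": False, "delay_hours": 0},
    {"name": "secondary", "arbiter": False, "delay_hours": 0},
    {"name": "arbiter",   "arbiter": True,  "delay_hours": 0},
    {"name": "offsite",   "arbiter": False, "delay_hours": 12},  # hidden, non-voting
]


def strict_majority(n):
    """Smallest count that is strictly more than half of n."""
    return n // 2 + 1


def required_acks(members, exclude_arbiters):
    """Acknowledgements needed, counting either all members or only non-arbiters."""
    pool = [m for m in members if not (exclude_arbiters and m["arbiter"])]
    return strict_majority(len(pool))


def prompt_acks(members):
    """Members that hold data and replicate without an artificial delay."""
    return sum(1 for m in members if not m["arbiter"] and m["delay_hours"] == 0)


total_based = required_acks(members, exclude_arbiters=False)  # majority of 4 members
data_based = required_acks(members, exclude_arbiters=True)    # majority of 3 non-arbiters
available = prompt_acks(members)                              # primary + secondary

print("total-based majority:", total_based, "satisfiable promptly:", available >= total_based)
print("non-arbiter majority:", data_based, "satisfiable promptly:", available >= data_based)
```

Under these assumptions the total-based count needs one more acknowledgement than the two non-delayed data-bearing members can give, so the wait only ends once the delayed member catches up.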

(As a temporary fix to get the balancing going again, we removed the replication delay to the off-site secondary.)

Not sure if this is the same issue as SERVER-12386, or just related to it.

is duplicated by

SERVER-12386 Use of arbiters prevents fault-tolerant w:majority writes (Closed)

is related to

SERVER-14403 Change w:majority write concern to indicate a majority of voting nodes (Closed)

SERVER-7681 Report majority number in ReplSetGetStatus/isMaster (Closed)

related to

SERVER-15764 unit test new majority write behavior in ReplicationCoordinator (Closed)

Assignee: [DO NOT USE] Backlog - Replication Team
Reporter: Filip Salomonsson
Participants: [DO NOT USE] Backlog - Replication Team, Asya Kamsky, Eric Milkie, Filip Salomonsson
Votes: 1
Watchers: 12

Created: Mar 06 2014 10:21:43 AM UTC
Updated: Apr 07 2023 02:33:02 PM UTC
Resolved: Dec 21 2015 09:15:30 PM UTC
