[SERVER-16597] Warn if no voting data bearing members Created: 18/Dec/14  Updated: 06/Dec/22  Resolved: 16/Mar/17

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.8.0-rc2
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Kevin Pulo Assignee: Backlog - Replication Team
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-17528 if votes>0, priority must be >0 Closed
Related
related to SERVER-14403 Change w:majority write concern to in... Closed
Assigned Teams:
Replication
Participants:

 Description   

I have no idea what w:majority is supposed to mean in a replica set where only arbiters can vote, but it can't be good. When I tested this, a w:majority write was acknowledged by the primary even though all the secondaries were down at the time.

Since such a replset config is highly likely to be (a) unsafe and (b) a mistake, we should output a warning if this situation is detected at replset initiate/reconfig time. We should also check to see if there are any other strange replset configs that users should be warned about.
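
To make the proposed check concrete, here is a minimal sketch of what such a warning could look like. It is not the server's implementation: `replSetConfigWarnings` is a hypothetical helper name, and the sketch assumes a config document shaped like `rs.conf()` output, with `votes` defaulting to 1 and `arbiterOnly` defaulting to false when absent.

```javascript
// Hypothetical helper (not part of the server or the mongo shell).
// Takes a replica set config document shaped like rs.conf() output and
// returns an array of warning strings.
function replSetConfigWarnings(config) {
    var warnings = [];
    var members = config.members || [];

    // Assumption: a member without an explicit "votes" field has one vote.
    var voters = members.filter(function (m) {
        var votes = (m.votes === undefined) ? 1 : m.votes;
        return votes > 0;
    });

    // Voting members that also hold data (i.e. are not arbiters).
    var votingDataBearing = voters.filter(function (m) {
        return !m.arbiterOnly;
    });

    if (voters.length > 0 && votingDataBearing.length === 0) {
        warnings.push("no voting data-bearing members: only arbiters can vote, " +
                      "so w:majority may be acknowledged without reaching any secondary");
    }
    return warnings;
}
```

A caller at initiate/reconfig time could log each returned string as a warning without rejecting the config. Other "strange config" checks could be appended to the same array.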



 Comments   
Comment by Kevin Pulo [ 16/Mar/17 ]

Yes, I believe so: the config was something like one member with priority 1 and votes 0, one member with priority 0 and votes 0, and an arbiter. In 3.2 and higher this is prohibited as an invalid config (SERVER-17528), and trying to initiate or reconfig with such a config returns errmsg: "priority must be 0 when non-voting (votes:0)". I confirmed this by retesting just now against 3.4.2 and 3.2.12.

However, 3.0.14 accepts this config, and has the symptoms originally described, where w:majority writes are acknowledged (when I would expect them to fail or "hang"). So the v3.0 branch is affected.
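
The config document `c` passed to `rs.initiate(c)` is not shown in this ticket. The sketch below reconstructs a plausible one from the `rs.conf()` output in the transcript; which fields were set explicitly and which were left to defaults is an assumption. The helper `nonVotingMembersWithPriority` is hypothetical and only illustrates the 3.2+ rule quoted in the error message above. It is not the server's validation code.

```javascript
// Reconstruction (assumed) of the config used in the transcript, based on
// the rs.conf() output; fields not listed are left to their defaults.
var c = {
    _id: "replset",
    members: [
        { _id: 0, host: "devkev-1:12345", priority: 1, votes: 0 },
        { _id: 1, host: "devkev-1:12346", priority: 0, votes: 0 },
        { _id: 2, host: "devkev-1:12347", arbiterOnly: true }
    ]
};

// Hypothetical check for the rule "priority must be 0 when non-voting
// (votes:0)". Returns the members that would violate it.
// Assumption: votes and priority both default to 1 when absent.
function nonVotingMembersWithPriority(config) {
    return config.members.filter(function (m) {
        var votes = (m.votes === undefined) ? 1 : m.votes;
        var priority = (m.priority === undefined) ? 1 : m.priority;
        return votes === 0 && priority > 0;
    });
}
```

Applied to `c`, the helper flags only member 0 (priority 1, votes 0), the member that went on to become primary in the 3.0.14 test.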

> rs.initiate(c)
{ "ok" : 1 }
replset:OTHER> rs.status()
{
	"set" : "replset",
	"date" : ISODate("2017-03-16T03:01:34.706Z"),
	"myState" : 1,
	"members" : [
		{
			"_id" : 0,
			"name" : "devkev-1:12345",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 90,
			"optime" : Timestamp(1489633262, 1),
			"optimeDate" : ISODate("2017-03-16T03:01:02Z"),
			"electionTime" : Timestamp(1489633266, 1),
			"electionDate" : ISODate("2017-03-16T03:01:06Z"),
			"configVersion" : 1,
			"self" : true,
			"repllag" : 0,
			"repllagHrs" : 0
		},
		{
			"_id" : 1,
			"name" : "devkev-1:12346",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 31,
			"optime" : Timestamp(1489633262, 1),
			"optimeDate" : ISODate("2017-03-16T03:01:02Z"),
			"lastHeartbeat" : ISODate("2017-03-16T03:01:33.006Z"),
			"lastHeartbeatRecv" : ISODate("2017-03-16T03:01:33.009Z"),
			"pingMs" : 0,
			"lastHeartbeatMessage" : "could not find member to sync from",
			"configVersion" : 1,
			"repllag" : 0,
			"repllagHrs" : 0
		},
		{
			"_id" : 2,
			"name" : "devkev-1:12347",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"uptime" : 31,
			"lastHeartbeat" : ISODate("2017-03-16T03:01:33.006Z"),
			"lastHeartbeatRecv" : ISODate("2017-03-16T03:01:33.011Z"),
			"pingMs" : 0,
			"configVersion" : 1
		}
	],
	"ok" : 1
}
replset:PRIMARY> rs.conf()
{
	"_id" : "replset",
	"version" : 1,
	"members" : [
		{
			"_id" : 0,
			"host" : "devkev-1:12345",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
 
			},
			"slaveDelay" : 0,
			"votes" : 0
		},
		{
			"_id" : 1,
			"host" : "devkev-1:12346",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 0,
			"tags" : {
 
			},
			"slaveDelay" : 0,
			"votes" : 0
		},
		{
			"_id" : 2,
			"host" : "devkev-1:12347",
			"arbiterOnly" : true,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
 
			},
			"slaveDelay" : 0,
			"votes" : 1
		}
	],
	"settings" : {
		"chainingAllowed" : true,
		"heartbeatTimeoutSecs" : 10,
		"getLastErrorModes" : {
 
		},
		"getLastErrorDefaults" : {
			"w" : 1,
			"wtimeout" : 0
		}
	}
}
replset:PRIMARY> db.version()
3.0.14
replset:PRIMARY> db.test.insert({})
WriteResult({ "nInserted" : 1 })
replset:PRIMARY> db.test.insert({}, { writeConcern: { w: "majority" } } )
WriteResult({ "nInserted" : 1 })
replset:PRIMARY> db.test.insert({}, { writeConcern: { w: 2 } } )
WriteResult({ "nInserted" : 1 })
replset:PRIMARY> db.test.insert({}, { writeConcern: { w: 3 } } )
WriteResult({
	"nInserted" : 1,
	"writeConcernError" : {
		"code" : 100,
		"errmsg" : "Not enough data-bearing nodes"
	}
})
replset:PRIMARY>
replset:PRIMARY>
replset:PRIMARY> rs.status()
{
	"set" : "replset",
	"date" : ISODate("2017-03-16T03:07:38.327Z"),
	"myState" : 1,
	"members" : [
		{
			"_id" : 0,
			"name" : "devkev-1:12345",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 454,
			"optime" : Timestamp(1489633383, 1),
			"optimeDate" : ISODate("2017-03-16T03:03:03Z"),
			"electionTime" : Timestamp(1489633266, 1),
			"electionDate" : ISODate("2017-03-16T03:01:06Z"),
			"configVersion" : 1,
			"self" : true,
			"repllag" : 0,
			"repllagHrs" : 0
		},
		{
			"_id" : 1,
			"name" : "devkev-1:12346",
			"health" : 0,
			"state" : 8,
			"stateStr" : "(not reachable/healthy)",
			"uptime" : 0,
			"optime" : Timestamp(0, 0),
			"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
			"lastHeartbeat" : ISODate("2017-03-16T03:07:37.438Z"),
			"lastHeartbeatRecv" : ISODate("2017-03-16T03:07:27.070Z"),
			"pingMs" : 0,
			"lastHeartbeatMessage" : "Failed attempt to connect to devkev-1:12346; couldn't connect to server devkev-1:12346 (172.31.0.248), connection attempt failed",
			"configVersion" : -1,
			"repllag" : 1489633383,
			"repllagHrs" : 413787.05083333334
		},
		{
			"_id" : 2,
			"name" : "devkev-1:12347",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"uptime" : 395,
			"lastHeartbeat" : ISODate("2017-03-16T03:07:37.075Z"),
			"lastHeartbeatRecv" : ISODate("2017-03-16T03:07:37.095Z"),
			"pingMs" : 0,
			"configVersion" : 1
		}
	],
	"ok" : 1
}
replset:PRIMARY>
replset:PRIMARY>
replset:PRIMARY> db.test.insert({})
WriteResult({ "nInserted" : 1 })
replset:PRIMARY> db.test.insert({}, { writeConcern: { w: "majority" } } )
WriteResult({ "nInserted" : 1 })
replset:PRIMARY> db.test.insert({}, { writeConcern: { w: 3 } } )
WriteResult({
	"nInserted" : 1,
	"writeConcernError" : {
		"code" : 100,
		"errmsg" : "Not enough data-bearing nodes"
	}
})
replset:PRIMARY> db.test.insert({}, { writeConcern: { w: 2 } } )
^C
do you want to kill the current op(s) on the server? (y/n): y

Comment by Eric Milkie [ 23/Nov/16 ]

kevin.pulo, when you did your test, how did you manage to have a primary in your set? Did you have a node with priority 1 and 0 votes?
