[SERVER-8483] reconfig may cause problem re-electing primary Created: 17/Jan/13  Updated: 29/Jan/15  Resolved: 29/Jan/15

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Greg Studer Assignee: Davide Italiano
Resolution: Incomplete Votes: 0
Labels: elections
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File currentTest.txt     Text File currentTest_failure_same_host_veto.txt     File sync_change_source.js    
Issue Links:
Related
Operating System: ALL
Participants:

 Description   

Setup is this:

Replica set with 4 nodes, all priority 0 except the first node A (so only node A can become primary).

Nodes B and C are slaveDelayed by either 0 or 40s, alternating via reconfigs.

Node D is blackholed from node A, symmetrically (A can't talk to D, and D can't talk to A).

At first, node D correctly switches syncing between nodes A and B, depending on which one is delayed. Each time a reconfig happens, node A drops to secondary and is then re-elected primary.

At some point, though, it seems impossible for node A to become primary again after a reconfig. There is a strange message in node A's log:

 m31000| Thu Jan 17 17:05:00.147 [rsMgr] not electing self, 127.0.0.4:31002 would veto with '127.0.0.2:31000 is trying to elect itself but 127.0.0.2:31000 is already primary and more up-to-date'

A test to reproduce the issue and the output from two runs are attached (with replSetStatus from all nodes every 5s during the problem period).
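For reference, a minimal mongo shell sketch of roughly this setup (illustrative only, not the attached sync_change_source.js; the member indexes and the toggleDelay() helper are assumptions):

// Four-node set: member 0 (node A) is the only node with nonzero priority,
// so only it can ever be elected primary.
var rst = new ReplSetTest({nodes: 4});
rst.startSet();
var conf = rst.getReplSetConfig();
conf.members[1].priority = 0;  // node B
conf.members[2].priority = 0;  // node C
conf.members[3].priority = 0;  // node D
rst.initiate(conf);

// Alternate which of B and C is slaveDelayed by 40s, via repeated reconfigs.
function toggleDelay(primary, delayB) {
    var c = primary.getDB("local").system.replset.findOne();
    c.version += 1;
    c.members[1].slaveDelay = delayB ? 40 : 0;
    c.members[2].slaveDelay = delayB ? 0 : 40;
    try {
        primary.adminCommand({replSetReconfig: c});
    } catch (e) {
        // replSetReconfig typically drops connections; the step-down and
        // re-election of node A described above happen around here.
        print("reconfig: " + e);
    }
}

The A/D blackholing is done separately with tc/netem (see the comments below).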



 Comments   
Comment by Eric Milkie [ 28/Aug/14 ]

We should consider using the jstests in this ticket to test the refactored RS code.

Comment by Greg Studer [ 18/Jan/13 ]

http://codereview.10gen.com/9191135

has the changes I'm using. It requires a kernel with netem support and the "tc" traffic control binary; the JavaScript binding is dumb and done via bash (but it's much easier than remembering all the parameters).

It also includes a patch for mongod that binds the source endpoint of a connection to a particular localhost address. Otherwise all outgoing connections look the same and are hard to select.
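For the record, the tc/netem setup is roughly of this shape (illustrative only; not the actual patch or binding from the code review above, and the setupBlackhole() helper is an assumption). It uses the usual netem trick of adding an extra prio band that only filtered traffic ever reaches:

// Shells out to tc via the mongo shell's runProgram() helper, much as the
// bash-based binding mentioned above does. Requires root, netem, and tc.
function tcRun(cmd) {
    runProgram("bash", "-c", cmd);
}

function setupBlackhole(srcIp, dstIp) {
    // prio qdisc with an extra band (1:4) that normal traffic never maps to...
    tcRun("tc qdisc add dev lo root handle 1: prio bands 4 " +
          "priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1");
    // ...with a netem qdisc on that band dropping 100% of its packets.
    tcRun("tc qdisc add dev lo parent 1:4 handle 40: netem loss 100%");
    // Steer traffic between the two addresses into the lossy band, one filter
    // per direction. This only catches outgoing connections because the patched
    // mongod binds their source endpoint to the node's own localhost address.
    tcRun("tc filter add dev lo parent 1: protocol ip prio 1 u32 " +
          "match ip src " + srcIp + "/32 match ip dst " + dstIp + "/32 flowid 1:4");
    tcRun("tc filter add dev lo parent 1: protocol ip prio 1 u32 " +
          "match ip src " + dstIp + "/32 match ip dst " + srcIp + "/32 flowid 1:4");
}

setupBlackhole("127.0.0.2", "127.0.0.5");  // node A <-> node D, both directions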

Comment by Kristina Chodorow (Inactive) [ 18/Jan/13 ]

Can you attach the libs/helpers.js file so I can play around with this?

Comment by Greg Studer [ 17/Jan/13 ]

What's also strange is the replSetStatus reported by all hosts when node A can't elect itself. Node A's status looks like:

{
	"set" : "testReplSet",
	"date" : ISODate("2013-01-17T22:04:00Z"),
	"myState" : 2,
	"members" : [
		{
			"_id" : 0,
			"name" : "127.0.0.2:31000",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 475,
			"optime" : {
				"t" : 1358460220000,
				"i" : 1
			},
			"optimeDate" : ISODate("2013-01-17T22:03:40Z"),
			"self" : true
		},
		{
			"_id" : 1,
			"name" : "127.0.0.3:31001",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 20,
			"optime" : {
				"t" : 1358460155000,
				"i" : 1
			},
			"optimeDate" : ISODate("2013-01-17T22:02:35Z"),
			"lastHeartbeat" : ISODate("2013-01-17T22:03:40Z"),
			"lastHeartbeatRecv" : ISODate("2013-01-17T22:03:59Z"),
			"pingMs" : 0,
			"lastHeartbeatMessage" : "db exception in producer: 10278 dbclient error communicating with server: 127.0.0.2:31000"
		},
		{
			"_id" : 2,
			"name" : "127.0.0.4:31002",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 20,
			"optime" : {
				"t" : 1358460155000,
				"i" : 1
			},
			"optimeDate" : ISODate("2013-01-17T22:02:35Z"),
			"lastHeartbeat" : ISODate("2013-01-17T22:03:40Z"),
			"lastHeartbeatRecv" : ISODate("2013-01-17T22:03:59Z"),
			"pingMs" : 0,
			"lastHeartbeatMessage" : "db exception in producer: 10278 dbclient error communicating with server: 127.0.0.2:31000"
		},
		{
			"_id" : 3,
			"name" : "127.0.0.5:31003",
			"health" : -1,
			"state" : 6,
			"stateStr" : "UNKNOWN",
			"uptime" : 0,
			"optime" : {
				"t" : 0,
				"i" : 0
			},
			"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
			"lastHeartbeat" : ISODate("1970-01-01T00:00:00Z"),
			"lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
			"pingMs" : 0
		}
	],
	"ok" : 1
}

while the status from node B (127.0.0.3:31001) is:

{
	"set" : "testReplSet",
	"date" : ISODate("2013-01-17T22:04:00Z"),
	"myState" : 2,
	"syncingTo" : "127.0.0.2:31000",
	"members" : [
		{
			"_id" : 0,
			"name" : "127.0.0.2:31000",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 19,
			"optime" : {
				"t" : 1358460220000,
				"i" : 1
			},
			"optimeDate" : ISODate("2013-01-17T22:03:40Z"),
			"lastHeartbeat" : ISODate("2013-01-17T22:03:59Z"),
			"lastHeartbeatRecv" : ISODate("2013-01-17T22:04:00Z"),
			"pingMs" : 0
		},
		{
			"_id" : 1,
			"name" : "127.0.0.3:31001",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 474,
			"optime" : {
				"t" : 1358460220000,
				"i" : 1
			},
			"optimeDate" : ISODate("2013-01-17T22:03:40Z"),
			"errmsg" : "syncing to: 127.0.0.2:31000",
			"self" : true
		},
		{
			"_id" : 2,
			"name" : "127.0.0.4:31002",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 19,
			"optime" : {
				"t" : 1358460155000,
				"i" : 1
			},
			"optimeDate" : ISODate("2013-01-17T22:02:35Z"),
			"lastHeartbeat" : ISODate("2013-01-17T22:03:59Z"),
			"lastHeartbeatRecv" : ISODate("2013-01-17T22:03:59Z"),
			"pingMs" : 0,
			"lastHeartbeatMessage" : "syncing to: 127.0.0.2:31000"
		},
		{
			"_id" : 3,
			"name" : "127.0.0.5:31003",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 19,
			"optime" : {
				"t" : 1358460220000,
				"i" : 1
			},
			"optimeDate" : ISODate("2013-01-17T22:03:40Z"),
			"lastHeartbeat" : ISODate("2013-01-17T22:03:59Z"),
			"lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
			"pingMs" : 0,
			"lastHeartbeatMessage" : "syncing to: 127.0.0.3:31001"
		}
	],
	"ok" : 1
}

There are no errors in node B's log, but node A logs errors each time near the failure.

Comment by Greg Studer [ 17/Jan/13 ]

We should discuss; I just wanted to put the details here. Maybe I can show you the setup, if this isn't trivial.
