[SERVER-5717] Secondaries continue syncingTo unresponsive replica set member Created: 27/Apr/12  Updated: 15/Aug/12  Resolved: 08/Aug/12

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.0.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jeffrey Yemin Assignee: Eric Milkie
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Operating System: ALL
Participants:

 Description   

To reproduce:

  1. Create a replica set with two secondaries (I tested with three members, but two secondaries should be enough)
  2. Suspend the primary process with kill -SIGSTOP
  3. Wait for a new primary to be elected
  4. Run rs.status() on the remaining secondary

The output shows the old primary (127.0.0.1:27018 below) in state 8 (not reachable/healthy), yet the secondary still reports it as its syncingTo target, e.g.

SECONDARY> rs.status()
{
	"set" : "test",
	"date" : ISODate("2012-04-27T19:21:08Z"),
	"myState" : 2,
	"syncingTo" : "127.0.0.1:27018",
	"members" : [
		{
			"_id" : 0,
			"name" : "127.0.0.1:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 372,
			"optime" : {
				"t" : 1335554453000,
				"i" : 1
			},
			"optimeDate" : ISODate("2012-04-27T19:20:53Z"),
			"lastHeartbeat" : ISODate("2012-04-27T19:21:08Z"),
			"pingMs" : 0
		},
		{
			"_id" : 1,
			"name" : "127.0.0.1:27018",
			"health" : 0,
			"state" : 8,
			"stateStr" : "(not reachable/healthy)",
			"uptime" : 0,
			"optime" : {
				"t" : 1335554452000,
				"i" : 1
			},
			"optimeDate" : ISODate("2012-04-27T19:20:52Z"),
			"lastHeartbeat" : ISODate("2012-04-27T19:20:52Z"),
			"pingMs" : 0,
			"errmsg" : "DBClientBase::findN: transport error: 127.0.0.1:27018 query: { replSetHeartbeat: \"test\", v: 108366, pv: 1, checkEmpty: false, from: \"127.0.0.1:27021\" }"
		},
		{
			"_id" : 2,
			"name" : "127.0.0.1:27019",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"uptime" : 81329,
			"optime" : {
				"t" : 0,
				"i" : 0
			},
			"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
			"lastHeartbeat" : ISODate("2012-04-27T19:21:06Z"),
			"pingMs" : 0
		},
		{
			"_id" : 3,
			"name" : "127.0.0.1:27020",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 481,
			"optime" : {
				"t" : 1335554453000,
				"i" : 1
			},
			"optimeDate" : ISODate("2012-04-27T19:20:53Z"),
			"lastHeartbeat" : ISODate("2012-04-27T19:21:07Z"),
			"pingMs" : 0
		},
		{
			"_id" : 4,
			"name" : "127.0.0.1:27021",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"optime" : {
				"t" : 1335554453000,
				"i" : 1
			},
			"optimeDate" : ISODate("2012-04-27T19:20:53Z"),
			"self" : true
		}
	],
	"ok" : 1
}
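
The inconsistency in the output above can also be checked programmatically. The following is a minimal sketch, not part of the original report: the helper name findStaleSyncTarget and the trimmed sample document are illustrative assumptions, and it relies only on the top-level "syncingTo" field and the per-member "name", "health" and "state" fields shown in the output above.

```javascript
// Hypothetical helper (not from the original report): returns the member
// document named by `syncingTo` when that member is unhealthy, or null
// when the sync target is healthy, unknown, or not set.
function findStaleSyncTarget(status) {
  if (!status.syncingTo) {
    return null; // no sync target reported
  }
  var target = null;
  status.members.forEach(function (m) {
    if (m.name === status.syncingTo) {
      target = m;
    }
  });
  if (target === null) {
    return null; // sync target is not listed among the members
  }
  // health 0 / state 8 is what the output above shows for the stopped primary
  if (target.health === 0 || target.state === 8) {
    return target;
  }
  return null;
}

// Trimmed copy of the rs.status() output from this report.
var sample = {
  set: "test",
  myState: 2,
  syncingTo: "127.0.0.1:27018",
  members: [
    { _id: 0, name: "127.0.0.1:27017", health: 1, state: 1, stateStr: "PRIMARY" },
    { _id: 1, name: "127.0.0.1:27018", health: 0, state: 8, stateStr: "(not reachable/healthy)" },
    { _id: 4, name: "127.0.0.1:27021", health: 1, state: 2, stateStr: "SECONDARY", self: true }
  ]
};

// For the sample above, `stale` is the member entry for 127.0.0.1:27018.
var stale = findStaleSyncTarget(sample);
```

In the mongo shell on the affected secondary, the same check could be run as findStaleSyncTarget(rs.status()); a non-null result indicates the behavior described in this report.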



 Comments   
Comment by Eric Milkie [ 08/Aug/12 ]

This works correctly in version 2.2.
