Secondaries continue syncingTo unresponsive replica set member


    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 2.0.4
    • Component/s: None

      To reproduce:

      1. Create a replica set with two secondaries (I actually tested this with three, but it should work with two)
      2. Suspend the primary with kill -SIGSTOP, so the process stays alive but stops responding
      3. Wait for a new primary to be elected
      4. Run rs.status() on the remaining secondary

      The output shows the old primary (127.0.0.1:27018) in state 8 with health 0, yet the secondary's syncingTo field still points at that member, e.g.

      SECONDARY> rs.status()
      {
      	"set" : "test",
      	"date" : ISODate("2012-04-27T19:21:08Z"),
      	"myState" : 2,
      	"syncingTo" : "127.0.0.1:27018",
      	"members" : [
      		{
      			"_id" : 0,
      			"name" : "127.0.0.1:27017",
      			"health" : 1,
      			"state" : 1,
      			"stateStr" : "PRIMARY",
      			"uptime" : 372,
      			"optime" : {
      				"t" : 1335554453000,
      				"i" : 1
      			},
      			"optimeDate" : ISODate("2012-04-27T19:20:53Z"),
      			"lastHeartbeat" : ISODate("2012-04-27T19:21:08Z"),
      			"pingMs" : 0
      		},
      		{
      			"_id" : 1,
      			"name" : "127.0.0.1:27018",
      			"health" : 0,
      			"state" : 8,
      			"stateStr" : "(not reachable/healthy)",
      			"uptime" : 0,
      			"optime" : {
      				"t" : 1335554452000,
      				"i" : 1
      			},
      			"optimeDate" : ISODate("2012-04-27T19:20:52Z"),
      			"lastHeartbeat" : ISODate("2012-04-27T19:20:52Z"),
      			"pingMs" : 0,
      			"errmsg" : "DBClientBase::findN: transport error: 127.0.0.1:27018 query: { replSetHeartbeat: \"test\", v: 108366, pv: 1, checkEmpty: false, from: \"127.0.0.1:27021\" }"
      		},
      		{
      			"_id" : 2,
      			"name" : "127.0.0.1:27019",
      			"health" : 1,
      			"state" : 7,
      			"stateStr" : "ARBITER",
      			"uptime" : 81329,
      			"optime" : {
      				"t" : 0,
      				"i" : 0
      			},
      			"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
      			"lastHeartbeat" : ISODate("2012-04-27T19:21:06Z"),
      			"pingMs" : 0
      		},
      		{
      			"_id" : 3,
      			"name" : "127.0.0.1:27020",
      			"health" : 1,
      			"state" : 2,
      			"stateStr" : "SECONDARY",
      			"uptime" : 481,
      			"optime" : {
      				"t" : 1335554453000,
      				"i" : 1
      			},
      			"optimeDate" : ISODate("2012-04-27T19:20:53Z"),
      			"lastHeartbeat" : ISODate("2012-04-27T19:21:07Z"),
      			"pingMs" : 0
      		},
      		{
      			"_id" : 4,
      			"name" : "127.0.0.1:27021",
      			"health" : 1,
      			"state" : 2,
      			"stateStr" : "SECONDARY",
      			"optime" : {
      				"t" : 1335554453000,
      				"i" : 1
      			},
      			"optimeDate" : ISODate("2012-04-27T19:20:53Z"),
      			"self" : true
      		}
      	],
      	"ok" : 1
      }
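The condition in this report can be checked mechanically. The following is a minimal sketch, not part of the server or the shell: findStaleSyncSource is a hypothetical helper name, and it assumes only the field names visible in the output above (syncingTo, members, name, health, state). It is written in plain ES5-style JavaScript so that it should also paste into the mongo shell.

```javascript
// Hypothetical helper (not a mongo shell built-in): given a document shaped
// like the rs.status() output in this report, return the member named by
// syncingTo if that member is reported unhealthy, otherwise null.
function findStaleSyncSource(status) {
  if (!status || !status.syncingTo || !status.members) {
    return null; // nothing to check: no sync source or no member list
  }
  for (var i = 0; i < status.members.length; i++) {
    var member = status.members[i];
    if (member.name === status.syncingTo) {
      // In the output above, an unreachable member has health 0 and state 8.
      var unhealthy = member.health === 0 || member.state === 8;
      return unhealthy ? member : null;
    }
  }
  return null; // syncingTo names a host that is not in the member list
}

// Trimmed copy of the status document from this report.
var sampleStatus = {
  set: "test",
  myState: 2,
  syncingTo: "127.0.0.1:27018",
  members: [
    { _id: 0, name: "127.0.0.1:27017", health: 1, state: 1, stateStr: "PRIMARY" },
    { _id: 1, name: "127.0.0.1:27018", health: 0, state: 8, stateStr: "(not reachable/healthy)" },
    { _id: 2, name: "127.0.0.1:27019", health: 1, state: 7, stateStr: "ARBITER" },
    { _id: 3, name: "127.0.0.1:27020", health: 1, state: 2, stateStr: "SECONDARY" },
    { _id: 4, name: "127.0.0.1:27021", health: 1, state: 2, stateStr: "SECONDARY", self: true }
  ],
  ok: 1
};

// For the sample, this is the member entry for the suspended old primary.
var stale = findStaleSyncSource(sampleStatus);
```

On a live secondary the same check would be findStaleSyncSource(rs.status()); a non-null result means the node still names an unreachable member as its sync source.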
      

            Assignee: Eric Milkie
            Reporter: Jeffrey Yemin
            Votes: 1
            Watchers: 3
