Type: Bug
Resolution: Cannot Reproduce
Priority: Major - P3
Affects Version/s: 3.2.11
Component/s: Replication
Suppose a replica set has two members: node1:27017 (primary) and node2:27017 (secondary).
case1: when the secondary restarts, it reports "could not find member to sync from"; it cannot choose a sync source because there is no new data to sync.
{
"set" : "mongo-9555",
"date" : ISODate("2017-01-16T09:35:02.381Z"),
"myState" : 1,
"term" : NumberLong(-1),
"heartbeatIntervalMillis" : NumberLong(2000),
"members" : [
{
"_id" : 0,
"name" : "node1:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 11,
"optime" : Timestamp(1484550472, 2),
"optimeDate" : ISODate("2017-01-16T07:07:52Z"),
"electionTime" : Timestamp(1484559293, 1),
"electionDate" : ISODate("2017-01-16T09:34:53Z"),
"configVersion" : 372974,
"self" : true
},
{
"_id" : 1,
"name" : "node2:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 10,
"optime" : Timestamp(1484550472, 2),
"optimeDate" : ISODate("2017-01-16T07:07:52Z"),
"lastHeartbeat" : ISODate("2017-01-16T09:35:01.585Z"),
"lastHeartbeatRecv" : ISODate("2017-01-16T09:35:01.512Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "could not find member to sync from",
"configVersion" : 372974
}
],
"ok" : 1
}
case2: after some data is written to the primary, the secondary chooses a sync source successfully.
{
"set" : "mongo-9555",
"date" : ISODate("2017-01-16T09:41:31.490Z"),
"myState" : 1,
"term" : NumberLong(-1),
"heartbeatIntervalMillis" : NumberLong(2000),
"members" : [
{
"_id" : 0,
"name" : "node1:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 400,
"optime" : Timestamp(1484559669, 2),
"optimeDate" : ISODate("2017-01-16T09:41:09Z"),
"electionTime" : Timestamp(1484559293, 1),
"electionDate" : ISODate("2017-01-16T09:34:53Z"),
"configVersion" : 372974,
"self" : true
},
{
"_id" : 1,
"name" : "node2:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 399,
"optime" : Timestamp(1484559669, 2),
"optimeDate" : ISODate("2017-01-16T09:41:09Z"),
"lastHeartbeat" : ISODate("2017-01-16T09:41:29.680Z"),
"lastHeartbeatRecv" : ISODate("2017-01-16T09:41:29.620Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "syncing from node1:27017",
"syncingTo" : "node1:27017",
"configVersion" : 372974
}
],
"ok" : 1
}
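The relevant difference between the two status documents above is whether node2 reports a sync source. As a minimal sketch of how to pull that out programmatically, the plain-JavaScript helper below summarizes each non-self member of an rs.status()-style document. The function name `summarizeSync`, the trimmed sample documents, and the modelling of optimes as plain `{t, i}` objects are assumptions made for illustration; none of this is part of the mongo shell.

```javascript
// Hypothetical helper (not a mongo shell built-in): for every member other
// than the reporting node, list its state, its sync source (if any), and
// whether its optime equals the reporting node's optime.
function summarizeSync(status) {
  // The reporting node is the member flagged with "self": true.
  const reporter = status.members.find((m) => m.self === true);
  return status.members
    .filter((m) => m !== reporter)
    .map((m) => ({
      name: m.name,
      state: m.stateStr,
      // "syncingTo" is absent when the member has no sync source.
      syncSource: m.syncingTo || null,
      // Assumption: optimes are plain {t: seconds, i: increment} objects.
      sameOptimeAsReporter:
        reporter !== undefined &&
        m.optime.t === reporter.optime.t &&
        m.optime.i === reporter.optime.i,
      message: m.lastHeartbeatMessage || "",
    }));
}

// Trimmed copies of the two status documents from this report.
const case1 = {
  members: [
    { name: "node1:27017", stateStr: "PRIMARY",
      optime: { t: 1484550472, i: 2 }, self: true },
    { name: "node2:27017", stateStr: "SECONDARY",
      optime: { t: 1484550472, i: 2 },
      lastHeartbeatMessage: "could not find member to sync from" },
  ],
};
const case2 = {
  members: [
    { name: "node1:27017", stateStr: "PRIMARY",
      optime: { t: 1484559669, i: 2 }, self: true },
    { name: "node2:27017", stateStr: "SECONDARY",
      optime: { t: 1484559669, i: 2 },
      lastHeartbeatMessage: "syncing from node1:27017",
      syncingTo: "node1:27017" },
  ],
};

console.log(JSON.stringify(summarizeSync(case1), null, 2));
console.log(JSON.stringify(summarizeSync(case2), null, 2));
```

In both cases the secondary's optime matches the primary's, so the only distinguishing field is the sync source: absent in case1 and node1:27017 in case2.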
rs.remove("node2:27017") behaves differently in the two cases above.
case1: node2 transitions to the REMOVED state and cannot find a sync source.
case2: node2 transitions to the REMOVED state but continues to tail the oplog from the primary.
So what is the expected behavior when a node is removed from a replica set?