- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 2.4.1, 2.4.3
- Component/s: Replication
-
Minor Change
-
ALL
-
- Intermittent; I have not been able to reproduce the exact state consistently. It never happens on OS X but has been observed under Ubuntu 12.04.
As the replSetGetStatus result below shows, 30011 should not be marked as healthy, yet it is. As a workaround, a set is now treated as healthy only if, for every member, the health value is 1, the member is in a valid state, and neither errmsg nor lastHeartbeatMessage is set. This looks like a logic error somewhere in the command.
The errmsg and lastHeartbeatMessage values are correct, since the replica set is not in a correct state. The health of 30011, however, should not be 1.
The server on port 30011 is down.
{
  set: 'replica-set-foo',
  date: Thu Apr 25 2013 14:14:28 GMT+0200 (CEST),
  myState: 2,
  syncingTo: 'localhost:30011',
  members: [
    {
      _id: 0,
      name: 'localhost:30010',
      health: 1,
      state: 2,
      stateStr: 'SECONDARY',
      uptime: 39,
      optime: [Object],
      optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
      errmsg: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011',
      self: true
    },
    {
      _id: 1,
      name: 'localhost:30011',
      health: 1,
      state: 1,
      stateStr: 'PRIMARY',
      uptime: 27,
      optime: [Object],
      optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
      lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
      lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
      pingMs: 0,
      lastHeartbeatMessage: 'still initializing'
    },
    {
      _id: 2,
      name: 'localhost:30012',
      health: 1,
      state: 2,
      stateStr: 'SECONDARY',
      uptime: 27,
      optime: [Object],
      optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
      lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
      lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
      pingMs: 0,
      lastHeartbeatMessage: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011'
    },
    {
      _id: 3,
      name: 'localhost:30013',
      health: 1,
      state: 7,
      stateStr: 'ARBITER',
      uptime: 27,
      lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
      lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
      pingMs: 0
    }
  ],
  ok: 1
}
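The workaround described above could look roughly like the following. This is only a sketch of the check, not the code actually in use: the helper names isMemberHealthy and isSetHealthy are made up for illustration, and treating PRIMARY (1), SECONDARY (2) and ARBITER (7) as the "valid" states is an assumption based on the states that appear in the output above.

```javascript
// Hypothetical helpers; not part of MongoDB or any driver.
// Assumption: "valid state" means PRIMARY (1), SECONDARY (2) or ARBITER (7).
const VALID_STATES = new Set([1, 2, 7]);

// A member counts as healthy only if health is 1, its state is valid,
// and neither errmsg nor lastHeartbeatMessage carries any text.
function isMemberHealthy(member) {
  return member.health === 1 &&
    VALID_STATES.has(member.state) &&
    !member.errmsg &&
    !member.lastHeartbeatMessage;
}

// The set counts as healthy only if every member passes the check above.
// `status` is the document returned by replSetGetStatus.
function isSetHealthy(status) {
  return Array.isArray(status.members) &&
    status.members.length > 0 &&
    status.members.every(isMemberHealthy);
}
```

Applied to the status document above, this check would reject the set even though every member reports health: 1, because members 0, 1 and 2 all carry an errmsg or lastHeartbeatMessage.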