- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 2.4.1, 2.4.3
- Component/s: Replication
- Minor Change
- ALL
- Intermittent; I have not been able to reproduce the exact states consistently. It never happens on OS X but has been observed under Ubuntu 12.04.
As the replSetGetStatus result below shows, 30011 should not be marked as healthy, yet it is. As a workaround, the set is treated as healthy only if, for every member, the health value is 1, the member is in a valid state, and neither errmsg nor lastHeartbeatMessage is set. This looks like a logic error somewhere in the command.
The errmsg and lastHeartbeatMessage values are correct, since the replica set is not in a correct state. The health of 30011, however, should not be 1.
The server on port 30011 is down.
{ set: 'replica-set-foo',
  date: Thu Apr 25 2013 14:14:28 GMT+0200 (CEST),
  myState: 2,
  syncingTo: 'localhost:30011',
  members:
   [ { _id: 0,
       name: 'localhost:30010',
       health: 1,
       state: 2,
       stateStr: 'SECONDARY',
       uptime: 39,
       optime: [Object],
       optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
       errmsg: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011',
       self: true },
     { _id: 1,
       name: 'localhost:30011',
       health: 1,
       state: 1,
       stateStr: 'PRIMARY',
       uptime: 27,
       optime: [Object],
       optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
       lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
       lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
       pingMs: 0,
       lastHeartbeatMessage: 'still initializing' },
     { _id: 2,
       name: 'localhost:30012',
       health: 1,
       state: 2,
       stateStr: 'SECONDARY',
       uptime: 27,
       optime: [Object],
       optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
       lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
       lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
       pingMs: 0,
       lastHeartbeatMessage: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011' },
     { _id: 3,
       name: 'localhost:30013',
       health: 1,
       state: 7,
       stateStr: 'ARBITER',
       uptime: 27,
       lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
       lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
       pingMs: 0 } ],
  ok: 1 }
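
The workaround described above could be sketched roughly as follows. This is only an illustration, not code from the reporter or from any driver: the function names `isMemberHealthy` and `isSetHealthy` are hypothetical, and treating states 1 (PRIMARY), 2 (SECONDARY) and 7 (ARBITER) as the "valid" states is an assumption, since the report does not say which states it accepts.

```javascript
// Hypothetical helpers; not part of any MongoDB driver API.
// Assumption: PRIMARY (1), SECONDARY (2) and ARBITER (7) count as valid states.
const VALID_STATES = new Set([1, 2, 7]);

// A member counts as healthy only if health is 1, its state is valid,
// and neither errmsg nor lastHeartbeatMessage is set.
function isMemberHealthy(member) {
  return member.health === 1 &&
    VALID_STATES.has(member.state) &&
    !member.errmsg &&
    !member.lastHeartbeatMessage;
}

// The set counts as healthy only if the command succeeded
// and every member passes the stricter check.
function isSetHealthy(status) {
  return status.ok === 1 &&
    Array.isArray(status.members) &&
    status.members.every(isMemberHealthy);
}

// Trimmed copy of the replSetGetStatus result shown above.
const reported = {
  ok: 1,
  members: [
    { name: 'localhost:30010', health: 1, state: 2,
      errmsg: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011' },
    { name: 'localhost:30011', health: 1, state: 1,
      lastHeartbeatMessage: 'still initializing' },
    { name: 'localhost:30012', health: 1, state: 2,
      lastHeartbeatMessage: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011' },
    { name: 'localhost:30013', health: 1, state: 7 }
  ]
};

console.log(isSetHealthy(reported)); // false
```

With this check, the status above is rejected even though every member reports health: 1, because three of the four members carry an errmsg or lastHeartbeatMessage.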