Core Server / SERVER-9460

replSetGetStatus seems to report health wrong

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.7.8
    • Affects Version/s: 2.4.1, 2.4.3
    • Component/s: Replication
    • Labels:
      None
    • Backwards Compatibility: Minor Change
    • Operating System: ALL
      • Intermittent; I have not been able to reproduce the exact states consistently. It never happens on OS X but has been observed on Ubuntu 12.04.

      As the replSetGetStatus result below shows, 30011 should not be marked as healthy, yet it is. To work around this, I treat the set as healthy only if each member's health value is 1, the member is in a valid state, and neither errmsg nor lastHeartbeatMessage is set. This looks like a logic error somewhere in the command.

      errmsg and lastHeartbeatMessage are correct, since the replica set is not in a correct state. The health of 30011, however, should not be 1.

      The server on port 30011 is down when this status is taken:

      { set: 'replica-set-foo',
        date: Thu Apr 25 2013 14:14:28 GMT+0200 (CEST),
        myState: 2,
        syncingTo: 'localhost:30011',
        members: 
         [ { _id: 0,
             name: 'localhost:30010',
             health: 1,
             state: 2,
             stateStr: 'SECONDARY',
             uptime: 39,
             optime: [Object],
             optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
             errmsg: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011',
             self: true },
           { _id: 1,
             name: 'localhost:30011',
             health: 1,
             state: 1,
             stateStr: 'PRIMARY',
             uptime: 27,
             optime: [Object],
             optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
             lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
             lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
             pingMs: 0,
             lastHeartbeatMessage: 'still initializing' },
           { _id: 2,
             name: 'localhost:30012',
             health: 1,
             state: 2,
             stateStr: 'SECONDARY',
             uptime: 27,
             optime: [Object],
             optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
             lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
             lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
             pingMs: 0,
             lastHeartbeatMessage: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011' },
           { _id: 3,
             name: 'localhost:30013',
             health: 1,
             state: 7,
             stateStr: 'ARBITER',
             uptime: 27,
             lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
             lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
             pingMs: 0 } ],
        ok: 1 }
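
The workaround described above can be sketched roughly as follows. This is only an illustration, not driver or server code: the helper names memberLooksHealthy and replicaSetLooksHealthy are hypothetical, and the set of states counted as "valid" (1 PRIMARY, 2 SECONDARY, 7 ARBITER) is an assumption based on the states visible in the output above.

```javascript
// Hypothetical helpers sketching the workaround; not part of MongoDB or any driver.
// Assumption: only PRIMARY (1), SECONDARY (2) and ARBITER (7) count as valid states.
const VALID_STATES = new Set([1, 2, 7]);

// A member is trusted only if health is 1, its state is valid,
// and neither errmsg nor lastHeartbeatMessage is set.
function memberLooksHealthy(member) {
  return member.health === 1 &&
    VALID_STATES.has(member.state) &&
    !member.errmsg &&
    !member.lastHeartbeatMessage;
}

// The set is trusted only if the command succeeded and every member passes the check.
function replicaSetLooksHealthy(status) {
  return status.ok === 1 &&
    Array.isArray(status.members) &&
    status.members.length > 0 &&
    status.members.every(memberLooksHealthy);
}
```

Applied to the status document above, replicaSetLooksHealthy would return false even though every member reports health: 1, because 30010 carries an errmsg and 30011 and 30012 carry a lastHeartbeatMessage.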
      
      

            Assignee: Unassigned
            Reporter: Christian Amor Kvalheim (christkv)
            Votes: 0
            Watchers: 8