  Core Server / SERVER-9460

replSetGetStatus seems to report health wrong

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • 2.7.8
    • Affects Version/s: 2.4.1, 2.4.3
    • Component/s: Replication
    • Minor Change
    • ALL
      • Intermittent; I have not been able to reproduce the exact states consistently. It never happens on OS X but has been observed under Ubuntu 12.04.

      As the replSetGetStatus result below shows, 30011 is marked as healthy even though it should not be. As a workaround, a set is treated as healthy only if, for every member, the health value is 1, the member is in a valid state, and neither errmsg nor lastHeartbeatMessage is set. This looks like a logic error somewhere in the command.

      errmsg and lastHeartbeatMessage are correct, since the replica set is not in a correct state. The health of 30011 should not be 1.

      The 30011 server is down.

      { set: 'replica-set-foo',
        date: Thu Apr 25 2013 14:14:28 GMT+0200 (CEST),
        myState: 2,
        syncingTo: 'localhost:30011',
        members: 
         [ { _id: 0,
             name: 'localhost:30010',
             health: 1,
             state: 2,
             stateStr: 'SECONDARY',
             uptime: 39,
             optime: [Object],
             optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
             errmsg: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011',
             self: true },
           { _id: 1,
             name: 'localhost:30011',
             health: 1,
             state: 1,
             stateStr: 'PRIMARY',
             uptime: 27,
             optime: [Object],
             optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
             lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
             lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
             pingMs: 0,
             lastHeartbeatMessage: 'still initializing' },
           { _id: 2,
             name: 'localhost:30012',
             health: 1,
             state: 2,
             stateStr: 'SECONDARY',
             uptime: 27,
             optime: [Object],
             optimeDate: Thu Apr 25 2013 14:14:20 GMT+0200 (CEST),
             lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
             lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
             pingMs: 0,
             lastHeartbeatMessage: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011' },
           { _id: 3,
             name: 'localhost:30013',
             health: 1,
             state: 7,
             stateStr: 'ARBITER',
             uptime: 27,
             lastHeartbeat: Thu Apr 25 2013 14:14:27 GMT+0200 (CEST),
             lastHeartbeatRecv: Thu Jan 01 1970 01:00:00 GMT+0100 (CET),
             pingMs: 0 } ],
        ok: 1 }
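
The workaround described in this report could be sketched roughly as follows. This is only an illustration, not the reporter's actual code: the helper names `isMemberHealthy` and `isSetHealthy` are hypothetical, and treating PRIMARY (1), SECONDARY (2) and ARBITER (7) as the "valid" states is an assumption. The sample object is a trimmed copy of the status output above.

```javascript
// Assumption: "valid state" means PRIMARY (1), SECONDARY (2) or ARBITER (7).
const VALID_STATES = new Set([1, 2, 7]);

// Hypothetical helper: a member counts as healthy only if health is 1,
// its state is valid, and neither errmsg nor lastHeartbeatMessage is set.
function isMemberHealthy(member) {
  return member.health === 1 &&
    VALID_STATES.has(member.state) &&
    !member.errmsg &&
    !member.lastHeartbeatMessage;
}

// Hypothetical helper: the set counts as healthy only if it has members
// and every one of them passes the member check.
function isSetHealthy(status) {
  return Array.isArray(status.members) &&
    status.members.length > 0 &&
    status.members.every(isMemberHealthy);
}

// Trimmed copy of the replSetGetStatus result from this report.
const reportedStatus = {
  set: 'replica-set-foo',
  myState: 2,
  members: [
    { _id: 0, name: 'localhost:30010', health: 1, state: 2,
      errmsg: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011',
      self: true },
    { _id: 1, name: 'localhost:30011', health: 1, state: 1,
      lastHeartbeatMessage: 'still initializing' },
    { _id: 2, name: 'localhost:30012', health: 1, state: 2,
      lastHeartbeatMessage: 'db exception in producer: 10278 dbclient error communicating with server: localhost:30011' },
    { _id: 3, name: 'localhost:30013', health: 1, state: 7 }
  ],
  ok: 1
};

console.log(isSetHealthy(reportedStatus)); // false
```

With this check, the set above is rejected even though every member reports health: 1, because three of the four members carry an errmsg or lastHeartbeatMessage.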
      
      

            Assignee: Unassigned
            Reporter: Christian Amor Kvalheim (christkv)
            Votes: 0
            Watchers: 8
