Failure to elect new primary when primary host goes away


    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 3.2.9
    • Component/s: None
    • ALL

      Bring up a 3-node replica set and wait for the nodes to settle.
      Kill the machine hosting the primary.
      The secondaries don't elect a new primary, even if left for over an hour.

      The problem doesn't happen every time, but it seems more likely if the primary is killed soon after the secondaries have joined the replica set and synced.

      I am able to recover the replica set manually by adjusting member priorities with rs.reconfig(), but I would have expected MongoDB to recover from this scenario by itself.
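
For readers who hit the same state, the manual recovery mentioned above might look roughly like the sketch below. It is only an illustration, not the exact commands used in this report: the helper name withRaisedPriority, the host names, and the priority value are all hypothetical, and the config object is assumed to have the shape returned by rs.conf().

```javascript
// Sketch only: return a copy of a replica set config in which one
// surviving member has a different priority.
// `cfg` is assumed to look like the output of rs.conf():
//   { _id: "rs0", version: 1, members: [{ _id: 0, host: "...", priority: 1 }, ...] }
function withRaisedPriority(cfg, preferredHost, newPriority) {
  var found = false;

  // Copy each member document so the caller's config is left untouched.
  var members = cfg.members.map(function (m) {
    var copy = {};
    for (var key in m) {
      if (Object.prototype.hasOwnProperty.call(m, key)) {
        copy[key] = m[key];
      }
    }
    if (copy.host === preferredHost) {
      copy.priority = newPriority;
      found = true;
    }
    return copy;
  });

  if (!found) {
    throw new Error("no member with host " + preferredHost);
  }

  // Shallow-copy the top-level fields and swap in the edited member list.
  var next = {};
  for (var field in cfg) {
    if (Object.prototype.hasOwnProperty.call(cfg, field)) {
      next[field] = cfg[field];
    }
  }
  next.members = members;
  return next;
}

// Possible use in the mongo shell, connected to a surviving secondary
// (host name and priority are placeholders):
//   var cfg = withRaisedPriority(rs.conf(), "node2.example:27017", 2);
//   rs.reconfig(cfg, { force: true });  // force: no primary is available to accept the reconfig
```

Whether force was needed in the reported case is not stated; it is shown here because a normal reconfig has to be issued on the primary, which is exactly what is missing.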


      I can regularly trigger a situation where, when the machine hosting the primary node of a replica set goes away, the remaining nodes fail to elect a new primary and stay as secondaries.

      The attached files are:

      1.log - mongod logs for the primary up until the machine was stopped.
      2.log, 3.log - complete mongod logs for the secondaries.
      rs.config.txt - the output of rs.config() once the replica set was configured.
      rs.status-before.txt - the replica set status before the primary machine was taken down.
      rs.status-after.txt - the replica set status some time after the primary machine was taken down.
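
When comparing the two status captures, a small helper along these lines can make the stuck state easier to spot. This is a hedged sketch: summarizeStatus is a hypothetical name, the input is assumed to have the shape of rs.status() output (a members array with name, health, and stateStr fields), and the majority check assumes every member holds exactly one vote.

```javascript
// Sketch only: summarise an rs.status()-shaped document.
// Assumes each member has one vote, so a majority is floor(n / 2) + 1.
function summarizeStatus(status) {
  var primary = null;
  var reachable = 0;

  status.members.forEach(function (m) {
    if (m.health === 1) {
      reachable += 1;
    }
    if (m.stateStr === "PRIMARY") {
      primary = m.name;
    }
  });

  var total = status.members.length;
  var needed = Math.floor(total / 2) + 1;

  return {
    primary: primary,                     // member name, or null if none
    reachable: reachable,                 // members reporting health 1
    total: total,
    majorityReachable: reachable >= needed
  };
}

// Possible use in the mongo shell:
//   printjson(summarizeStatus(rs.status()));
```

For a 3-member set with one member down, this reports two reachable members out of three, which is a majority. A result with no primary while a majority is reachable is the situation this report describes.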

        1. 1.log
          49 kB
          Menno Finlay-Smits
        2. 2.log
          140 kB
          Menno Finlay-Smits
        3. 3.log
          298 kB
          Menno Finlay-Smits
        4. rs.config.txt
          1 kB
          Menno Finlay-Smits
        5. rs.status-after.txt
          1 kB
          Menno Finlay-Smits
        6. rs.status-before.txt
          2 kB
          Menno Finlay-Smits

            Assignee:
            Unassigned
            Reporter:
            Menno Finlay-Smits
            Votes:
            0
            Watchers:
            9

              Created:
              Updated:
              Resolved: