Core Server / SERVER-41447

Stepdown hook can run out of secondaries to step up

    • ALL
    • Repl 2019-06-17
    • 12

      The hook iterates over all secondaries and tries to step up each one in turn. If a node fails to get elected, we remove it from consideration, so if every node fails we run out of nodes to step up. In many suites we set the election timeout to 24 hours, so a node that resets its election timeout callback will not run for election on its own before the test times out. If the node that was primary during the step-up attempts steps down and we have exhausted the list of secondaries, no primary will be elected and the test will time out.
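
      The failure mode can be sketched as follows. This is a minimal illustration, not the actual hook code: try_step_up_any and the step_up callback are hypothetical names, and the callback stands in for whatever the hook does to ask a node to run for election.

```python
from typing import Callable, List, Optional


def try_step_up_any(secondaries: List[str],
                    step_up: Callable[[str], bool]) -> Optional[str]:
    """Try each secondary once, dropping any node whose election fails.

    Returns the name of the node that won, or None once the list of
    candidates is exhausted (hypothetical helper, for illustration only).
    """
    candidates = list(secondaries)
    while candidates:
        node = candidates.pop(0)
        if step_up(node):
            return node
        # The failed node is not put back, so it is never retried.
    return None


# If every election fails, the loop ends with no primary chosen.
new_primary = try_step_up_any(["node1", "node2"], lambda node: False)
```

      With a 24 hour election timeout nothing else triggers an election after the loop gives up, so a result of None here corresponds to the test hanging until it times out.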

      We recently switched some suites to not step down the former primary before trying to step up a secondary, which has made this happen more often. Because the former primary has not stepped down, it keeps accepting writes while nodes are running for election. A node can pass the dry run but fail the real election because the primary and another node became fresher than the candidate between the two.
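
      The race between the dry run and the real election can be sketched like this. It is a simplified model under stated assumptions: optimes are plain integers, a voter says yes only if the candidate is at least as fresh as the voter, and wins_election and the node names are hypothetical.

```python
from typing import Dict


def wins_election(candidate: str, last_applied: Dict[str, int]) -> bool:
    """Return True if a majority of nodes would vote for the candidate.

    Assumption: a node votes yes only when the candidate's optime is not
    behind its own; the candidate always votes for itself.
    """
    votes = sum(
        1
        for node, optime in last_applied.items()
        if node == candidate or optime <= last_applied[candidate]
    )
    return votes > len(last_applied) // 2


# Dry run: the candidate is as fresh as every other node.
optimes = {"primary": 10, "secondary_a": 10, "secondary_b": 10}
dry_run_ok = wins_election("secondary_a", optimes)

# The primary is still up, accepts a write, and secondary_b replicates it
# before the real election starts.
optimes["primary"] = 11
optimes["secondary_b"] = 11
real_ok = wins_election("secondary_a", optimes)
```

      In this model dry_run_ok is True and real_ok is False: in a three-node set the candidate then has only its own vote, which is not a majority.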

            Assignee: Lingzhi Deng (lingzhi.deng@mongodb.com)
            Reporter: Samyukta Lanka (samy.lanka@mongodb.com)
            Votes: 0
            Watchers: 6
