  1. Core Server
  2. SERVER-48776

Remove config version and term check during the reconfig quorum check

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.4.1, 4.7.0
    • Affects Version/s: None
    • Component/s: Replication
    • Labels: None
    • Fully Compatible
    • v4.4
    • Sprint: Repl 2020-06-29, Repl 2020-07-13, Repl 2020-07-27
    • 9

      During the reconfig quorum check, if we learn that another node has a newer config, we fail the reconfig command with NewReplicaSetConfigurationIncompatible.

      This extra check seems unnecessary with the safe reconfig protocol.

      The error is also confusing in a concurrent stepdown/reconfig scenario:

      • We have a 5 node replica set, with three voting nodes (node0, node2, and node4)
      • The current config is {version: 22, term: 10} and the current primary is node2

      • We step up node0, and it runs for an election in term 11
      • Node2 receives a reconfig command for {version: 23, term: 10}
      • Node2 steps down because it hears of a new term, 11, via a vote request from node0. Note that during stepdown we do not kill the reconfig command unless it is writing down the config document (which takes a DB X lock).
      • Node0 wins the election (with votes from node2 and node4) and successfully increments the term on step up. The current config is {version: 22, term: 11}
      • Node2 does not install the newer config since it's already in the midst of a reconfig
      • Finally, Node2 fails during its quorum check because Node0 already has a newer config.

      If we remove the config version and term check from the quorum check, the reconfig will instead fail later in the protocol. This is still safe and also returns a more accurate error (NotMaster).
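
      The scenario above can be sketched as a small model. This is only an illustration, not the server's implementation: the class and function names are hypothetical, and it assumes configs are ordered by term first and then by version, which is consistent with {version: 22, term: 11} being treated as newer than {version: 23, term: 10} in the steps above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigId:
    """Hypothetical (version, term) pair identifying a replica set config."""
    version: int
    term: int

    def is_newer_than(self, other: "ConfigId") -> bool:
        # Assumed ordering: compare term first, then version.
        return (self.term, self.version) > (other.term, other.version)


def quorum_check_with_config_check(proposed: ConfigId, responses: dict) -> str:
    """Model of the check being removed: fail if any node reports a newer config."""
    for reported in responses.values():
        if reported.is_newer_than(proposed):
            return "NewReplicaSetConfigurationIncompatible"
    return "OK"


def reconfig_without_config_check(still_primary: bool) -> str:
    """Model of the proposed behaviour: no config comparison during the
    quorum check; a later step fails because the node is no longer primary."""
    return "OK" if still_primary else "NotMaster"


# Node2 proposes {version: 23, term: 10}; node0 has already installed
# {version: 22, term: 11} after winning the election.
proposed = ConfigId(version=23, term=10)
responses = {"node0": ConfigId(version=22, term=11)}

old_result = quorum_check_with_config_check(proposed, responses)
new_result = reconfig_without_config_check(still_primary=False)
```

      In this model the existing check reports NewReplicaSetConfigurationIncompatible, while dropping it lets the stepped-down node2 report NotMaster, which describes the actual cause of the failure.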

            Assignee:
            pavithra.vetriselvan@mongodb.com Pavithra Vetriselvan
            Reporter:
            pavithra.vetriselvan@mongodb.com Pavithra Vetriselvan
            Votes:
            0
            Watchers:
            3
