  1. Documentation
  2. DOCS-13582

Investigate changes in SERVER-47331: Rethink the transition from force reconfig to safe reconfig

Details

    • Task
    • Status: Closed
    • Major - P3
    • Resolution: Duplicate
    • None
    • 4.7.0
    • manual, Server
    • None

    Description

      Downstream Change Summary

      Transition from force reconfig to safe reconfig:
      The safety of a new non-force reconfig is not guaranteed until the current config is installed by a non-force reconfig and committed.

      Description of Linked Ticket

      When the current config C0 was installed by a "force" reconfig, the next non-force reconfig with config C1 does not prevent config divergence if the following sequence occurs:
      1. Reconfig C1 has not yet propagated to a majority of nodes.
      2. A failover happens.
      3. A new reconfig with a different config C2 runs on the new primary.
      4. C1 and C2 propagate to disjoint sets of nodes.

      The diverged configs may lead to two primaries elected in the same term until C2 (with a higher config term) propagates to a majority of C1. A similar issue is shown in SERVER-47119 with a detailed trace.
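
      The risk described above can be illustrated with a minimal sketch. The member sets and the names `majority` and `can_elect` are hypothetical and not taken from the ticket; the sketch only checks whether each diverged config can still form its own majority among the nodes that hold it, which is the condition under which two primaries could be elected in the same term.

```python
def majority(members):
    """Number of votes needed to win an election in a config."""
    return len(members) // 2 + 1


def can_elect(config_members, holders):
    """True if the nodes holding this config include a majority of its members."""
    return len(config_members & holders) >= majority(config_members)


# Hypothetical member sets: C1 has five members, C2 has three.
c1_members = {"A", "B", "C", "D", "E"}
c2_members = {"A", "B", "C"}

# Step 4 of the scenario: C1 and C2 have propagated to disjoint nodes.
c1_holders = {"C", "D", "E"}
c2_holders = {"A", "B"}
assert c1_holders.isdisjoint(c2_holders)

# Each side can gather a majority under its own config without the other side.
two_primaries_possible = (
    can_elect(c1_members, c1_holders) and can_elect(c2_members, c2_holders)
)
print(two_primaries_possible)
```

      In this assumed layout neither side needs a vote from the other, so nothing forces the two elections to overlap on a common node.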

      In the Initial Sync Semantics project, we will give new nodes votes: 0 and then run an automatic reconfig to grant them votes. The reconfig that adds the node will face the unsafe but rare case described above. Once that first reconfig passes the unsafe period and becomes committed, the subsequent automatic reconfigs will be safe.
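
      As a rough illustration of the two-step flow in the paragraph above, the sketch below builds the config documents for "add with votes: 0" and "grant the vote afterwards". The helper names, the replica set name, and the host names are hypothetical; only the field names (`version`, `members`, `_id`, `host`, `votes`, `priority`) come from the replica set configuration document. It models the text literally and is not the server's actual implementation.

```python
import copy


def add_nonvoting_member(config, host):
    """Return a new config that adds `host` with votes: 0 (hypothetical helper)."""
    new_config = copy.deepcopy(config)
    next_id = max(m["_id"] for m in new_config["members"]) + 1
    # A member with votes: 0 must also have priority: 0.
    new_config["members"].append(
        {"_id": next_id, "host": host, "votes": 0, "priority": 0}
    )
    new_config["version"] += 1
    return new_config


def grant_vote(config, host):
    """Return a new config that gives `host` a vote (the follow-up reconfig)."""
    new_config = copy.deepcopy(config)
    for member in new_config["members"]:
        if member["host"] == host:
            member["votes"] = 1
    new_config["version"] += 1
    return new_config


# Hypothetical three-node starting config.
c0 = {
    "_id": "rs0",
    "version": 1,
    "members": [
        {"_id": 0, "host": "n0:27017"},
        {"_id": 1, "host": "n1:27017"},
        {"_id": 2, "host": "n2:27017"},
    ],
}

c1 = add_nonvoting_member(c0, "n3:27017")  # first reconfig: add the node
c2 = grant_vote(c1, "n3:27017")            # automatic reconfig: grant the vote
```

      Each step is a separate reconfig with its own version, which is why the first one can still fall into the unsafe window while the later one does not once the first is committed.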

      To avoid the unsafe case, one idea is to run an automatic reconfig after a force reconfig that increases the config version and gives the config a term. After this automatic reconfig, subsequent reconfigs will be safe. However, when users run a "force" reconfig, the replset is likely unstable and they are willing to risk losing committed data, so it may not be the right time to run such an automatic reconfig.

      Even worse, the automatic reconfig may interrupt the propagation of the "force" reconfig. For example, assume the current config C0 has 5 nodes and a force reconfig C1 runs on a secondary to convert that secondary into a single-node replica set. The force reconfig C1 increases the version but removes the config term, then propagates to the other nodes on their next heartbeats. Nodes in C0 become REMOVED after learning C1. However, if an automatic reconfig C2 then happens on the single-node replset, C2 has a term, and that term has to be higher than C0's term for C2 to propagate, which may not be the case if another election occurs in C0. As a result, C2 may not be able to propagate to nodes still in C0. If the terms are the same, nodes in C0 will have a diverged config. They will stay alive and keep sending heartbeats to the single-node replset. When either C0 or C2 has the higher term, it will be propagated to the other side, potentially overriding the force reconfig.
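
      The ordering problem in this example can be sketched as follows. The comparison rule here is an assumption made for illustration (compare versions only when either config lacks a term, otherwise compare term first and then version), and `ConfigId`, `supersedes`, and all version and term numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConfigId:
    version: int
    term: Optional[int]  # None models a force reconfig, which has no config term


def supersedes(candidate, current):
    """Assumed rule: True if `candidate` would replace `current` on a node."""
    if candidate.term is None or current.term is None:
        # Without a term on one side, only the version can be compared.
        return candidate.version > current.version
    return (candidate.term, candidate.version) > (current.term, current.version)


c0 = ConfigId(version=1, term=2)        # C0 after another election raised its term
c1 = ConfigId(version=1001, term=None)  # force reconfig: higher version, no term
c2 = ConfigId(version=1002, term=1)     # automatic reconfig on the single node

print(supersedes(c1, c0))  # force config replaces C0 by version alone
print(supersedes(c2, c0))  # C2 has a lower term than C0, so it cannot propagate
print(supersedes(c0, c2))  # C0 can flow back and override the single node's config
```

      Under these assumed numbers, installing C2 gives the single-node config a term, which makes it comparable by term with C0 and lets C0 win, whereas C1 alone would have replaced C0.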

      Scope of changes

      Impact to Other Docs

      MVP (Work and Date)

      Resources (Scope or Design Docs, Invision, etc.)

            People

              jeffrey.allen@mongodb.com Jeffrey Allen
              backlog-server-pm Backlog - Core Eng Program Management Team
              Jeffrey Allen
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved:
                1 year, 15 weeks, 6 days ago