Add guardrails to ensure FCV upgrades/downgrades are finalized


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: 6.0.0, 7.0.0, 8.0.0, 8.2.0-rc0, 8.1.0
    • Component/s: None
    • Catalog and Routing, Replication
    • 3
    • TBD
    • 🟩 Routing and Topology

      The current structure of FCV upgrade through setFeatureCompatibilityVersion is as follows:

      1. Start phase
        1. Persist transitional FCV document ("kUpgrading")
      2. Prepare phase
        1. Run prepare steps (_prepareToUpgrade)
      3. Complete phase
        1. Run upgrade steps (_runUpgrade)
        2. Persist final FCV document ("kUpgraded" - enables feature flags)
        3. Run finalize steps (_finalizeUpgrade)

       

      This structure is fragile: if a stepdown happens after the final FCV document is persisted (step 3.2) but before the finalize steps (step 3.3) complete, the nodes cannot tell after a restart that an upgrade was in progress, because the FCV document is already in its final state.
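      The failure window can be illustrated with a toy model. This is only a sketch, not MongoDB's implementation: the Node class, the "initial" state label, the finalized flag, and the fail_before parameter are all hypothetical names introduced for illustration. Only the step numbers and the "kUpgrading"/"kUpgraded" states come from the description above.

```python
from dataclasses import dataclass, field


class Stepdown(Exception):
    """Simulated stepdown that interrupts the command."""


@dataclass
class Node:
    fcv_doc: str = "initial"   # hypothetical label for the pre-upgrade state
    finalized: bool = False    # stands in for "all metadata upgraded"
    log: list = field(default_factory=list)


def set_fcv_upgrade(node, fail_before=None):
    """Run the upgrade steps in order; raise Stepdown just before `fail_before`."""
    def step(name, action):
        if fail_before == name:
            raise Stepdown(name)
        action()
        node.log.append(name)

    step("1.1", lambda: setattr(node, "fcv_doc", "kUpgrading"))  # start phase
    step("2.1", lambda: None)                                    # _prepareToUpgrade
    step("3.1", lambda: None)                                    # _runUpgrade
    step("3.2", lambda: setattr(node, "fcv_doc", "kUpgraded"))   # final FCV document
    step("3.3", lambda: setattr(node, "finalized", True))        # _finalizeUpgrade
    return {"ok": 1}


def upgrade_in_progress(node):
    """All a restarted node can observe is the persisted FCV document."""
    return node.fcv_doc == "kUpgrading"


node = Node()
try:
    set_fcv_upgrade(node, fail_before="3.3")  # stepdown between 3.2 and 3.3
except Stepdown:
    pass

# The FCV document looks final, yet finalization never ran.
print(node.fcv_doc, node.finalized, upgrade_in_progress(node))
# prints: kUpgraded False False
```

      In this model, a stepdown before step 3.2 leaves the document in "kUpgrading", so the interrupted upgrade is detectable. Only the gap between 3.2 and 3.3 is invisible.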

       

      In this situation, users should retry the setFeatureCompatibilityVersion command until they receive an ok: 1 response. This ensures that the upgrade is finalized and therefore that all metadata has been upgraded to the new version.
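      A minimal sketch of such a retry loop follows. The helper name retry_until_ok and its parameters are hypothetical. The run_command callable is assumed to issue setFeatureCompatibilityVersion (for example, through a thin wrapper around a driver's admin command call) and to either return the server reply as a dict or raise on failure.

```python
import time


def retry_until_ok(run_command, max_attempts=10, delay_seconds=1.0, sleep=time.sleep):
    """Call run_command() until its reply has ok == 1.

    Returns the number of attempts used. Raises RuntimeError if no attempt
    succeeds, since the upgrade may then still be unfinalized.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            reply = run_command()
            if reply.get("ok") == 1:
                return attempt
            last_error = reply
        except Exception as exc:  # e.g. a stepdown or network error surfaced by the driver
            last_error = exc
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise RuntimeError(
        f"command not acknowledged after {max_attempts} attempts: {last_error!r}"
    )
```

      The loop treats both an exception and a non-ok reply as "not yet finalized", because either outcome can leave the finalize steps unexecuted.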

      However, there are no technical guardrails to ensure this is done. For example, the user could instead swap the MongoDB binaries to a new version, which could leave the metadata in a mixed state long term.

       

      We should add guardrails that force the upgrade to be finalized before any further binary or FCV changes can be made.

       

      Note that since v8.2, FCV downgrades also have this problem.

            Assignee:
            Unassigned
            Reporter:
            Joan Bruguera Micó
            Votes:
            0
            Watchers:
            5
