Resume setFCV from the point where it got interrupted


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: 8.3.0-rc0
    • Component/s: Upgrade/Downgrade
    • Catalog and Routing
    • CAR Team 2026-02-16
    • 🟩 Routing and Topology

      Context: setFCV has three phases ("start", "prepare", "complete"). They were introduced for the sharded-cluster setFCV protocol; on a replica set, setFCV simply needs to run all three phases in succession.

       

      Pain point: Currently, on a replica set, if setFCV is interrupted (e.g. in the "complete" phase), running setFCV again re-runs all three phases ("start", "prepare" and "complete"). This requires every step to be idempotent even when re-executed out of order, and updating the FCV document needs careful logic to avoid rolling it back to a previous state. This makes it hard to add the extra phase for Symmetric FCV.

       

      Solution: At the beginning of setFCV, decide from the FCV document which phases still need to run, and then run only those phases. We should take inspiration from how the ShardingDDLCoordinators are structured.
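
A minimal sketch of the proposed control flow, in Python for brevity rather than the server's own language. It assumes the FCV document carries a hypothetical `lastCompletedPhase` field; the ticket does not say how the resume point would actually be derived from the FCV document, so the field name, `phases_to_run` and `run_set_fcv` are illustrative only.

```python
from enum import IntEnum
from typing import Callable, Dict, List, Optional


class Phase(IntEnum):
    # The three setFCV phases named in the ticket, in execution order.
    START = 1
    PREPARE = 2
    COMPLETE = 3


def phases_to_run(last_completed: Optional[Phase]) -> List[Phase]:
    """Return the phases that are still pending, in order."""
    if last_completed is None:
        return list(Phase)
    return [p for p in Phase if p > last_completed]


def run_set_fcv(fcv_doc: dict, handlers: Dict[Phase, Callable[[], None]]) -> None:
    # ASSUMPTION: "lastCompletedPhase" is a hypothetical field, not part of
    # the real FCV document schema.
    name = fcv_doc.get("lastCompletedPhase")
    last = Phase[name] if name else None
    for phase in phases_to_run(last):
        handlers[phase]()
        # Record progress so that a retry resumes after this phase.
        # (In-memory only here; a real implementation would persist it.)
        fcv_doc["lastCompletedPhase"] = phase.name
```

With this shape, an interruption during "complete" leaves `lastCompletedPhase` at `PREPARE`, so a retry re-runs only "complete". Each phase still has to tolerate being re-run after a partial execution of itself, but no longer after later phases have already run.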

            Assignee:
            Joan Bruguera Micó
            Reporter:
            Joan Bruguera Micó
            Votes:
            0
            Watchers:
            1

              Created:
              Updated: