- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 9.0.0-rc0
- Component/s: Sharding, Upgrade/Downgrade
- Catalog and Routing
- Fully Compatible
- ALL
- CAR Team 2026-06-22
- 200
As part of Authoritative Shards, setFCV clones the authoritative DB/collection metadata from the config server to the shards. This is done by having the config server spawn the cloning DDL coordinator on the shards during its kUpgrading transitional FCV.
The expectation is that the recipient shards are also in kUpgrading. This holds on the "happy path", but there are some edge cases where it does not.
Edge case 1: Retrying a setFCV that got interrupted after the shards were sent to FCV 9.0 (UPGRADED).
1. All nodes are on FCV 8.0, and the user starts an FCV upgrade to 9.0.
2. During the kPrepare phase (all configsvr+shardsvrs on kUpgrading FCV), we clone the authoritative metadata from the configsvr to the shards.
3. The config server enters the kComplete phase and sends the shards to FCV 9.0 (UPGRADED).
4. However, right before the config server itself goes to FCV 9.0 (UPGRADED), it steps down.
5. The setFCV upgrade to 9.0 is retried. This re-executes all the (kStart, kPrepare, kComplete) phases.
6. During the re-execution of the kPrepare phase, we re-send the clone authoritative metadata request to the shards, despite the shards already being in FCV 9.0 (UPGRADED).
This edge case cannot happen with Symmetric FCV (since SERVER-119476, in steps 5-6 we resume the upgrade from the kComplete phase without re-executing kPrepare).
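The sequence in edge case 1 can be sketched with a small toy model. This is only an illustration, not the server's implementation: `run_upgrade`, the `Fcv` enum, and the dictionary-based node state are hypothetical names, and the model assumes that a retry leaves shards already on FCV 9.0 untouched during kStart.

```python
from enum import Enum


class Fcv(Enum):
    V8_0 = "8.0"
    UPGRADING = "kUpgrading"
    V9_0 = "9.0"


def run_upgrade(config, shards, clone_log, fail_before_config_commit=False):
    """Toy setFCV upgrade that always runs kStart, kPrepare, kComplete."""
    # kStart: nodes that are not yet upgraded enter the transitional FCV.
    config["fcv"] = Fcv.UPGRADING
    for shard in shards:
        if shard["fcv"] is not Fcv.V9_0:
            shard["fcv"] = Fcv.UPGRADING

    # kPrepare: send a clone request to every shard and record the FCV
    # the shard is in when the request arrives.
    for shard in shards:
        clone_log.append((shard["name"], shard["fcv"]))

    # kComplete: shards are moved to 9.0 first, then the config server.
    for shard in shards:
        shard["fcv"] = Fcv.V9_0
    if fail_before_config_commit:
        return False  # simulated step-down: config server stays in kUpgrading
    config["fcv"] = Fcv.V9_0
    return True


config = {"fcv": Fcv.V8_0}
shards = [{"name": "shard0", "fcv": Fcv.V8_0}, {"name": "shard1", "fcv": Fcv.V8_0}]
clone_log = []

run_upgrade(config, shards, clone_log, fail_before_config_commit=True)  # steps 1-4
run_upgrade(config, shards, clone_log)                                  # steps 5-6

for name, fcv in clone_log:
    print(name, "received clone request while on", fcv.value)
```

In this model the first two log entries show the shards in kUpgrading, while the two entries from the retry show them already on 9.0, which is the unexpected state described above.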
Edge case 2: Config-server-only "downgrading to upgrading" FCV transition
1. All nodes are on FCV 9.0, and setFCV starts an FCV downgrade to 8.0.
2. During the kStart phase, the config server sets its FCV to kDowngrading and then fails right after, before it can send any of the shards to kDowngrading.
3. The user then decides to re-upgrade to FCV 9.0.
4. The config server then sets its FCV to kUpgrading and re-executes all the (kStart, kPrepare, kComplete) phases. The shards do nothing since they are already on FCV 9.0 (UPGRADED).
5. During the kPrepare phase, we re-send the clone authoritative metadata request to the shards, despite the shards already being in FCV 9.0 (UPGRADED).
This edge case can still happen with Symmetric FCV.
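Edge case 2 can be traced the same way. The sketch below is a hypothetical model (the function names and the string FCV labels are assumptions made for illustration, not server code): only the config server ever leaves FCV 9.0, yet the re-upgrade still issues clone requests during kPrepare.

```python
def start_downgrade_then_fail(config_server):
    """kStart of the downgrade: the config server changes FCV, then fails
    before contacting any shard."""
    config_server["fcv"] = "kDowngrading"


def reupgrade(config_server, shard_nodes):
    """Re-upgrade to 9.0, re-running kStart, kPrepare and kComplete."""
    # kStart: only the config server needs to move; shards are already on 9.0.
    config_server["fcv"] = "kUpgrading"
    # kPrepare: clone requests go out to every shard; record the FCV each
    # shard is in when it receives the request.
    seen = [(shard["name"], shard["fcv"]) for shard in shard_nodes]
    # kComplete: the config server finishes the transition.
    config_server["fcv"] = "9.0"
    return seen


config_server = {"fcv": "9.0"}
shard_nodes = [{"name": "shard0", "fcv": "9.0"}, {"name": "shard1", "fcv": "9.0"}]

start_downgrade_then_fail(config_server)
requests_seen = reupgrade(config_server, shard_nodes)
print(requests_seen)
```

Because the shards never changed FCV, resuming from the interrupted phase would not help in this model, which is consistent with the note that Symmetric FCV does not rule this case out.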
Fix: For idempotency, shards should do nothing if they receive a cloning DDL request while already on FCV 9.0, since they are already authoritative.
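A minimal sketch of the proposed behavior on the shard side, under stated assumptions: `handle_clone_request`, the `"cloned"`/`"noop"` return values, and the dictionary-based shard state are hypothetical and only illustrate the early-return idea, not the actual coordinator code.

```python
def handle_clone_request(shard, config_metadata):
    """Handle a clone authoritative metadata request on a shard.

    If the shard is already on FCV 9.0 it is already authoritative, so the
    request is acknowledged without touching local metadata.
    """
    if shard["fcv"] == "9.0":
        return "noop"
    # Shard is in the transitional FCV: copy the metadata from the config server.
    shard["metadata"] = dict(config_metadata)
    return "cloned"


config_metadata = {"db1": {"primary": "shard0"}}

upgrading_shard = {"fcv": "kUpgrading", "metadata": {}}
upgraded_shard = {"fcv": "9.0", "metadata": {"db1": {"primary": "shard0"}}}

first = handle_clone_request(upgrading_shard, config_metadata)
second = handle_clone_request(upgraded_shard, config_metadata)
retry = handle_clone_request(upgraded_shard, config_metadata)
print(first, second, retry)
```

With this check, a request that arrives late (as in both edge cases) is safe to retry any number of times, since it leaves the upgraded shard's metadata unchanged.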
is related to
- SERVER-119476 Resume setFCV from the point where it got interrupted (Closed)
related to
- SERVER-129028 Internal command auth jstest should only run _shardsvrCloneAuthoritativeMetadata with Authoritative Shards (Closed)