Type: Task
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: 6.0.0, 7.0.0, 8.0.0, 8.2.0-rc0, 8.1.0
Component/s: None
Catalog and Routing, Replication
🟩 Routing and Topology
The current structure of an FCV upgrade through setFeatureCompatibilityVersion is as follows:
1. Start phase
2. Prepare phase
3. Complete phase

This structure is fragile: if a stepdown happens after the final FCV document is persisted (3.2) but before the finalize steps (3.3) complete, the nodes cannot tell after a restart that an upgrade was in progress, because the FCV document is already in its final state.
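
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `FcvDocument`, `Node`, and their fields are hypothetical stand-ins rather than the server's actual data structures, and the phases are collapsed to the writes that matter here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FcvDocument:
    """Simplified, hypothetical stand-in for the persisted FCV document."""
    version: str
    # Assumed to be set only while a transition is in progress.
    target_version: Optional[str] = None


class Node:
    """Toy node: both attributes model state that survives a restart."""

    def __init__(self, fcv: FcvDocument) -> None:
        self.fcv = fcv
        self.metadata_finalized = True

    def upgrade(self, target: str, stepdown_before_finalize: bool = False) -> None:
        # Start/prepare phases: persist a transitional document.
        self.fcv = FcvDocument(version=self.fcv.version, target_version=target)
        self.metadata_finalized = False
        # Complete phase, step 3.2: persist the final FCV document.
        self.fcv = FcvDocument(version=target)
        if stepdown_before_finalize:
            return  # stepdown: the finalize steps (3.3) never run
        # Complete phase, step 3.3: finalize the metadata.
        self.metadata_finalized = True

    def upgrade_in_progress_after_restart(self) -> bool:
        # After a restart, the persisted FCV document is all the node can inspect.
        return self.fcv.target_version is not None


node = Node(FcvDocument("7.0"))
node.upgrade("8.0", stepdown_before_finalize=True)
print(node.upgrade_in_progress_after_restart())  # False: the document looks final
print(node.metadata_finalized)                   # False: finalize never ran
```

In this model the interrupted node reports no upgrade in progress even though its metadata was never finalized, which is the gap this ticket describes.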

In this situation, users should retry the setFeatureCompatibilityVersion command until they receive an ok: 1 response. This ensures that the upgrade is finalized and thus that all metadata has been upgraded to the new version.
However, there are no technical guardrails to ensure this is done. For example, the user could instead swap the MongoDB binaries to a new version, which could leave mixed metadata in place long-term.
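
Until such guardrails exist, the retry advice above could be scripted roughly as follows. This is a minimal sketch, not an official procedure: `retry_set_fcv` and its parameters are hypothetical names, and the command is passed in as a callable so the loop does not depend on any particular driver. With PyMongo, the callable might wrap `client.admin.command(...)` with the setFeatureCompatibilityVersion command document for the target version.

```python
import time
from typing import Any, Callable, Mapping, Optional


def retry_set_fcv(
    run_set_fcv: Callable[[], Mapping[str, Any]],
    max_attempts: int = 10,
    delay_seconds: float = 1.0,
) -> Mapping[str, Any]:
    """Call run_set_fcv until it returns a reply with ok == 1.

    run_set_fcv is a hypothetical callable that issues
    setFeatureCompatibilityVersion and returns the server reply.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            reply = run_set_fcv()
            if reply.get("ok") == 1:
                return reply  # the upgrade has been finalized
            last_error = RuntimeError(f"non-ok reply: {dict(reply)}")
        except Exception as exc:
            # Assumption: any error (for example, a stepdown mid-command)
            # is treated as retryable in this sketch.
            last_error = exc
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(
        f"setFeatureCompatibilityVersion not confirmed after {max_attempts} attempts"
    ) from last_error


# Usage with a fake command that fails once and then succeeds.
replies = iter([{"ok": 0}, {"ok": 1}])
print(retry_set_fcv(lambda: next(replies), delay_seconds=0))  # {'ok': 1}
```

A script like this only helps operators who choose to run it; it does not prevent a binary swap while the upgrade is still unfinalized.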

We should add guardrails that force the upgrade to be finalized before any further binary or FCV changes can be made.

Note that as of v8.2, FCV downgrades also have this problem.
- is related to: SERVER-105971 Investigate if we should take stronger locks during setFCV (Open)
- related to: SERVER-105970 Operations that check feature flags with enable_on_transitional_fcv: true and write to disk can race with setFCV downgrade (Open)