[SERVER-53894] setFCV transitions to fully upgraded/downgraded too early on command retry Created: 19/Jan/21  Updated: 29/Oct/23  Resolved: 05/Feb/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.9.0
Fix Version/s: 4.9.0

Type: Bug
Priority: Major - P3
Reporter: Jason Chan
Assignee: Ali Mir
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-52349 Enable feature flag for Remove the Ne... Closed
Related
is related to SERVER-50423 Change memberConfig's slaveDelay fiel... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl 2021-02-08
Participants:

 Description   

Currently, when a setFCV command fails, we can expect the FCV to end up in the intermediary "upgrading"/"downgrading" states. This should be safe because we expect the setFCV command to be idempotent and a user can simply call the setFCV command again to complete the upgrade.

In SERVER-51474, we refactored and simplified a lot of the FCV code. This included adding an updateFeatureCompatibilityVersionDocument function that updates the FCV document to the next version if the requested version is a viable transition and differs from the current version. The setFCV command calls this function twice: once to transition from downgraded -> upgrading, and again to transition from upgrading -> upgraded.
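As a rough illustration of the "advance to the next version" behavior described above, here is a minimal Python model. It is only a sketch: the state names, the next_fcv_state helper, and the upgrade-only scope are assumptions made for illustration and do not reflect the server's actual C++ interface.

```python
# Toy model of the transition helper; names are hypothetical, not the server API.
DOWNGRADED, UPGRADING, UPGRADED = "downgraded", "upgrading", "upgraded"


def next_fcv_state(current: str, requested: str) -> str:
    """Advance one step toward `requested` when it differs from `current`."""
    if current == requested:
        return current  # already at the requested version, nothing to do
    if requested == UPGRADED:
        # First call: downgraded -> upgrading. Second call: upgrading -> upgraded.
        return UPGRADING if current == DOWNGRADED else UPGRADED
    raise ValueError(f"transition {current} -> {requested} is not modelled here")


state = next_fcv_state(DOWNGRADED, UPGRADED)  # first call inside setFCV
assert state == UPGRADING
state = next_fcv_state(state, UPGRADED)       # second call inside setFCV
assert state == UPGRADED
```

Note that the result depends only on the stored state, so the helper cannot tell whether it is being called for the first or second time within a given setFCV invocation.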

This is problematic because we often add upgrade/downgrade logic in the middle of a setFCV call. Examples are when we have to do a reconfig after transitioning to the "upgrading" state in the safe reconfig project and as part of SERVER-50423. The following scenario will have upgrade/downgrade concerns:
1. Call setFCV(upgradeVersion). updateFeatureCompatibilityVersionDocument(upgradeVersion) is called and sets the FCV to upgrading. setFCV fails and returns an error before it can complete.
2. Call setFCV(upgradeVersion) again. updateFeatureCompatibilityVersionDocument(upgradeVersion) now transitions the FCV from upgrading to upgraded. The node fails again before setFCV completes.
3. The node is now fully upgraded but has never completed the additional upgrade/downgrade behavior that is part of the setFCV command.
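The three steps above can be reproduced with a small simulation. This is a hedged sketch, not server code: the Node class, step_fcv, set_fcv, and the fail_before_work flag are all hypothetical stand-ins, and only the upgrade direction is modelled.

```python
class Node:
    """Toy stand-in for a node's FCV document plus its upgrade side effects."""

    def __init__(self):
        self.fcv = "downgraded"
        self.upgrade_work_done = False


def step_fcv(node, requested):
    # Moves one step toward `requested` based only on the stored state.
    if node.fcv == "downgraded":
        node.fcv = "upgrading"
    elif node.fcv == "upgrading":
        node.fcv = requested


def set_fcv(node, requested, fail_before_work=False):
    step_fcv(node, requested)      # intended: downgraded -> upgrading
    if fail_before_work:
        raise RuntimeError("setFCV interrupted")  # simulated failure
    node.upgrade_work_done = True  # e.g. a reconfig performed mid-command
    step_fcv(node, requested)      # intended: upgrading -> upgraded


node = Node()
for _ in range(2):                 # steps 1 and 2: two failed attempts
    try:
        set_fcv(node, "upgraded", fail_before_work=True)
    except RuntimeError:
        pass

print(node.fcv, node.upgrade_work_done)  # prints: upgraded False
```

On the retry, the first step inside set_fcv already finds the state at upgrading and advances it to upgraded, which is how the model ends up fully upgraded with the upgrade work skipped.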

Ultimately, we never want to enter the fully upgraded/downgraded FCV until the command has succeeded (and all upgrade/downgrade behavior has been performed).
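One way to picture this property is a command that always re-enters the intermediate state first and performs the final transition only after the upgrade work has run. The sketch below is an assumption-laden illustration (Node, set_fcv_idempotent, and fail_before_work are hypothetical names, and only upgrades are modelled); it is not a description of the fix that was committed.

```python
class Node:
    """Toy stand-in for a node's FCV document plus its upgrade side effects."""

    def __init__(self):
        self.fcv = "downgraded"
        self.upgrade_work_done = False


def set_fcv_idempotent(node, requested="upgraded", fail_before_work=False):
    if node.fcv == requested:
        return                     # already complete, nothing to redo
    node.fcv = "upgrading"         # a retry stays in the intermediate state
    if fail_before_work:
        raise RuntimeError("setFCV interrupted")  # simulated failure
    node.upgrade_work_done = True  # all upgrade behavior runs here
    node.fcv = requested           # final transition only after the work succeeds


node = Node()
for _ in range(2):                 # two failed attempts leave the FCV at upgrading
    try:
        set_fcv_idempotent(node, fail_before_work=True)
    except RuntimeError:
        pass
assert (node.fcv, node.upgrade_work_done) == ("upgrading", False)

set_fcv_idempotent(node)           # a successful retry completes the upgrade
assert (node.fcv, node.upgrade_work_done) == ("upgraded", True)
```

In this model, no number of failed retries can reach the fully upgraded state, because the final write happens only on the successful path.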



 Comments   
Comment by Githook User [ 05/Feb/21 ]

Author: Ali Mir <ali.mir@mongodb.com> (ali-mir)

Message: SERVER-53894 Ensure updateFeatureCompatibilityVersionDocument() is idempotent
Branch: master
https://github.com/mongodb/mongo/commit/23d20aefc26f110da7ffee7c9b7ba1c33751b538

Comment by Jason Chan [ 19/Jan/21 ]

For additional context, on v4.4 (before the refactor), setFCV calls the setTargetUpgrade function, which always sets the version to "upgrading". Therefore, we never transition to fully upgraded/fully downgraded before running all of our required upgrade/downgrade behavior.
