[SERVER-57228] Config Server crashes when updating FCV using an inconsistent FCV document Created: 26/May/21 Updated: 29/Oct/23 Resolved: 28/May/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 5.0.0-rc1, 5.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Antonio Fuschetto | Assignee: | Antonio Fuschetto |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||
| Operating System: | ALL | ||||||||||
| Backport Requested: |
v5.0
|
||||||||||
| Steps To Reproduce: | The steps to reproduce the problem with a sharded cluster from a Mongo Shell connected to the Mongo Router are:
|
||||||||||
| Sprint: | Sharding EMEA 2021-05-31 | ||||||||||
| Participants: | |||||||||||
| Linked BF Score: | 160 | ||||||||||
| Description |
|
When receiving an FCV update request, the Config Server relies on the current admin.system.version document {"_id": "featureCompatibilityVersion"} to determine whether it must recover from a previously interrupted run. The Config Server can crash during an FCV upgrade if that document has been explicitly (and illegally) modified beforehand. The invalid persisted information leads the Config Server down a code path that requires a field missing from the document (namely changeTimestamp), and it then hits an invariant. In this scenario, the FCV document is syntactically correct but semantically incorrect. |
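The failure mode described above can be sketched as follows. This is a minimal illustration in Python, not the server's C++ code. The function names, the `UserError` class, the version strings, and the assumption that an in-progress transition is marked by a `targetVersion` field are all hypothetical; only the `_id` value and the `changeTimestamp` field name come from this ticket. The sketch shows a well-formed but inconsistent document being reported to the caller as an error, in the spirit of a user assertion, instead of terminating the process.

```python
class UserError(Exception):
    """Hypothetical stand-in for a user-facing assertion: an error
    returned to the client rather than a process abort."""


def is_transitioning(fcv_doc):
    # Assumption for illustration: an upgrade/downgrade in progress is
    # marked by the presence of a 'targetVersion' field.
    return "targetVersion" in fcv_doc


def check_fcv_document(fcv_doc):
    """Return the change timestamp to resume from, or None when no
    transition is in progress.

    Raises UserError when the document claims a transition is in
    progress but does not carry 'changeTimestamp'.
    """
    if fcv_doc.get("_id") != "featureCompatibilityVersion":
        raise UserError("not an FCV document")
    if not is_transitioning(fcv_doc):
        return None
    if "changeTimestamp" not in fcv_doc:
        # Syntactically valid, semantically inconsistent: report it
        # to the caller instead of crashing.
        raise UserError(
            "inconsistent FCV document: transition in progress "
            "but 'changeTimestamp' is missing"
        )
    return fcv_doc["changeTimestamp"]


if __name__ == "__main__":
    # Hypothetical manually edited document: transitional, no timestamp.
    tampered = {
        "_id": "featureCompatibilityVersion",
        "version": "4.4",
        "targetVersion": "5.0",
    }
    try:
        check_fcv_document(tampered)
    except UserError as err:
        print("request rejected:", err)
```

The design point is that the check runs when the document is read, so it still applies to a document that was edited through a path that bypasses any write-time validation.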
| Comments |
| Comment by Vivian Ge (Inactive) [ 06/Oct/21 ] | ||||||||||||||||||
|
Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you! | ||||||||||||||||||
| Comment by Githook User [ 28/May/21 ] | ||||||||||||||||||
|
Author: {'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'} Message: | ||||||||||||||||||
| Comment by Githook User [ 27/May/21 ] | ||||||||||||||||||
|
Author: {'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'} Message: | ||||||||||||||||||
| Comment by Antonio Fuschetto [ 27/May/21 ] | ||||||||||||||||||
|
We had a good range of possible solutions to the problem, such as implementing a consistency check when the FCV document is inserted or modified (that is, the changeTimestamp field must be present during the upgrade/downgrade operation). Nevertheless, considering the various use cases and the resulting possibility that these checks would not be triggered, I decided to simply eliminate the risk of a crash by replacing the invariant with a user assertion (uassert). See below the new user experience:
|