[SERVER-59866] Stop FCV from waiting for majority when currentCommittedSnapshot is dropped Created: 09/Sep/21  Updated: 29/Oct/23  Resolved: 23/Sep/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.4.10, 5.0.4, 5.1.0-rc0

Type: Bug Priority: Major - P3
Reporter: Matthew Russotto Assignee: Matthew Russotto
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
Related
related to SERVER-59867 Split horizon mappings in ReplSetConf... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0, v4.4
Sprint: Repl 2021-09-20, Repl 2021-10-04
Participants:

 Description   

To avoid breaking the system during a binary upgrade/downgrade, we make

{getParameter: 1, featureCompatibilityVersion: 1}

wait for the FCV change to make it into the stable checkpoint. It does this by using the waitForMajority mechanism to wait for the currentCommittedSnapshot, which is usually the same as the stable checkpoint.

These diverge when a config change either changes writeConcernMajorityJournalDefault or is a force config which changes the contents of the set. At those times the currentCommittedSnapshot is cleared. This would be inconsequential if it weren't for another bug: configs with split horizons are erroneously determined to be different when they are not. This means that a config change brought about by an election, which is a force config on 4.4, can clear the currentCommittedSnapshot. If we never get a majority write after that point (e.g. because the other nodes were shut down), the wait never completes and we will never be able to read the FCV. Unfortunately, Cloud Backup has a procedure which commonly triggers this.

We can fix this by clearing the lastFCVUpdateSnapshot when we dropAllSnapshots (4.4) or clearCommittedSnapshot (5.0) in ReplicationCoordinatorExternalStateImpl.
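
The failure and the proposed fix can be illustrated with a small toy model. This is only a sketch and not the server's code: the class ToyNode, its fields, and the simulate helper are hypothetical names invented for illustration, and it assumes that lastFCVUpdateSnapshot acts as a marker for the FCV write that an FCV read must see covered by the currentCommittedSnapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyNode:
    """Hypothetical stand-in for one replica set member (not MongoDB's classes)."""

    clear_fcv_marker_on_snapshot_drop: bool  # True models the proposed fix
    current_committed_snapshot: Optional[int] = None
    last_fcv_update_snapshot: Optional[int] = None
    next_timestamp: int = 1

    def set_fcv(self) -> None:
        # Remember the timestamp of the FCV write that readers must wait for.
        self.last_fcv_update_snapshot = self.next_timestamp
        self.next_timestamp += 1

    def majority_commit(self) -> None:
        # A majority write advances the committed snapshot to the latest write.
        self.current_committed_snapshot = self.next_timestamp - 1

    def drop_committed_snapshot(self) -> None:
        # Models dropAllSnapshots (4.4) / clearCommittedSnapshot (5.0).
        self.current_committed_snapshot = None
        if self.clear_fcv_marker_on_snapshot_drop:
            self.last_fcv_update_snapshot = None

    def fcv_read_would_block(self) -> bool:
        # Assumption: the read waits until the committed snapshot covers
        # the recorded FCV write; with no recorded write there is no wait.
        if self.last_fcv_update_snapshot is None:
            return False
        committed = self.current_committed_snapshot
        return committed is None or committed < self.last_fcv_update_snapshot


def simulate(with_fix: bool) -> bool:
    """Return True if the FCV read would hang after the snapshot is dropped."""
    node = ToyNode(clear_fcv_marker_on_snapshot_drop=with_fix)
    node.set_fcv()
    node.majority_commit()
    assert not node.fcv_read_would_block()  # healthy state: read succeeds
    node.drop_committed_snapshot()  # e.g. force config after an election
    # No further majority writes happen: the other nodes are shut down.
    return node.fcv_read_would_block()
```

In this model, simulate(with_fix=False) reports that the read would block indefinitely, while simulate(with_fix=True) reports that it would not, because dropping the snapshot also clears the marker the read was waiting on.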



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 05/Oct/21 ]

Author:

{'name': 'Matthew Russotto', 'email': 'matthew.russotto@mongodb.com', 'username': 'mtrussotto'}

Message: SERVER-59866 Stop FCV from waiting for majority when currentCommitted…

(cherry picked from commit 7e5e6088eaf3ff2e01740cb52efa16c1fb8d360b)
Branch: v4.4
https://github.com/mongodb/mongo/commit/b3cc5b18bda3cd24465126f1603fa54b2ea7926e

Comment by Githook User [ 04/Oct/21 ]

Author:

{'name': 'Matthew Russotto', 'email': 'matthew.russotto@mongodb.com', 'username': 'mtrussotto'}

Message: SERVER-59866 Stop FCV from waiting for majority when currentCommitted…

(cherry picked from commit 7e5e6088eaf3ff2e01740cb52efa16c1fb8d360b)
Branch: v5.0
https://github.com/mongodb/mongo/commit/bda9de9adfb1b290b59d9787c24d1751a1d3883c

Comment by Githook User [ 17/Sep/21 ]

Author:

{'name': 'Matthew Russotto', 'email': 'matthew.russotto@mongodb.com', 'username': 'mtrussotto'}

Message: SERVER-59866 Stop FCV from waiting for majority when currentCommitted…
Branch: master
https://github.com/mongodb/mongo/commit/7e5e6088eaf3ff2e01740cb52efa16c1fb8d360b
