[SERVER-59866] Stop FCV from waiting for majority when currentCommittedSnapshot is dropped Created: 09/Sep/21 Updated: 29/Oct/23 Resolved: 23/Sep/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.10, 5.0.4, 5.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matthew Russotto | Assignee: | Matthew Russotto |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||
| Backport Requested: | v5.0, v4.4 | ||||||||||||||||||||
| Sprint: | Repl 2021-09-20, Repl 2021-10-04 | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Description |
|
To avoid breaking the system during a binary upgrade/downgrade, we make {getParameter: 1, featureCompatibilityVersion: 1} wait for the FCV change to make it into the stable checkpoint. It does this by using the waitForMajority mechanism to wait for the currentCommittedSnapshot, which is usually the same as the stable checkpoint. The two diverge if we do a config change which either changes writeConcernMajorityJournalDefault or is a force config which changes the contents of the set; at those times the currentCommittedSnapshot is cleared. This would be inconsequential if it weren't for another bug: configs with split horizons are erroneously determined to be different when they are not. This means that a config change brought about by an election, which is a force config on 4.4, can clear the currentCommittedSnapshot. If we never get a majority write after that point (e.g. because the other nodes were shut down), we will never be able to read the FCV. Unfortunately, Cloud Backup has a procedure which commonly triggers this. We can fix this by clearing the lastFCVUpdateSnapshot when we dropAllSnapshots (4.4) or clearCommittedSnapshot (5.0) in ReplicationCoordinatorExternImpl. |
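The failure mode and the proposed fix can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's code: `ToyNode`, its fields and methods, and `simulate` are hypothetical names, and snapshots are simplified to plain integers. It assumes an FCV read blocks while a recorded FCV update snapshot is not yet covered by the committed snapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyNode:
    """Hypothetical single-node model; these are not MongoDB's real classes."""

    current_committed_snapshot: Optional[int] = None
    last_fcv_update_snapshot: Optional[int] = None
    clear_fcv_marker_on_drop: bool = True  # True models the proposed fix

    def set_fcv(self, snapshot: int) -> None:
        # An FCV write records the snapshot that readers must see committed.
        self.last_fcv_update_snapshot = snapshot

    def advance_committed_snapshot(self, snapshot: int) -> None:
        # A majority write moves the committed snapshot forward.
        self.current_committed_snapshot = snapshot

    def drop_committed_snapshot(self) -> None:
        # Models the config change that clears the committed snapshot.
        self.current_committed_snapshot = None
        if self.clear_fcv_marker_on_drop:
            # The fix: forget the FCV marker together with the snapshot.
            self.last_fcv_update_snapshot = None

    def fcv_read_would_block(self) -> bool:
        # Assumption: a read waits only while a recorded FCV update
        # is not covered by the committed snapshot.
        if self.last_fcv_update_snapshot is None:
            return False
        committed = self.current_committed_snapshot
        return committed is None or committed < self.last_fcv_update_snapshot


def simulate(fixed: bool) -> bool:
    """Return True if an FCV read would hang after the snapshot is dropped."""
    node = ToyNode(clear_fcv_marker_on_drop=fixed)
    node.set_fcv(10)
    node.advance_committed_snapshot(10)  # FCV change is majority committed
    node.drop_committed_snapshot()       # election-driven config change
    # No majority write follows (e.g. the other nodes were shut down).
    return node.fcv_read_would_block()
```

In this model, `simulate(fixed=False)` reports a blocked read because the stale marker keeps waiting on a snapshot that never returns, while `simulate(fixed=True)` does not.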
| Comments |
| Comment by Vivian Ge (Inactive) [ 06/Oct/21 ] |
|
Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you! |
| Comment by Githook User [ 05/Oct/21 ] |
|
Author: Matthew Russotto <matthew.russotto@mongodb.com> (mtrussotto) Message: (cherry picked from commit 7e5e6088eaf3ff2e01740cb52efa16c1fb8d360b) |
| Comment by Githook User [ 04/Oct/21 ] |
|
Author: Matthew Russotto <matthew.russotto@mongodb.com> (mtrussotto) Message: (cherry picked from commit 7e5e6088eaf3ff2e01740cb52efa16c1fb8d360b) |
| Comment by Githook User [ 17/Sep/21 ] |
|
Author: Matthew Russotto <matthew.russotto@mongodb.com> (mtrussotto) Message: |