  1. Core Server
  2. SERVER-59866

Stop FCV from waiting for majority when currentCommittedSnapshot is dropped

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.4.10, 5.0.4, 5.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v5.0, v4.4
    • Sprint: Repl 2021-09-20, Repl 2021-10-04

      To avoid breaking the system during a binary upgrade or downgrade, we make

      {getParameter: 1, featureCompatibilityVersion: 1}

      wait for the FCV change to reach the stable checkpoint. It does this by using the waitForMajority mechanism to wait for the currentCommittedSnapshot, which is usually the same as the stable checkpoint.

      The two diverge when a config change either changes writeConcernMajorityJournalDefault or is a force config that changes the contents of the set. At those times the currentCommittedSnapshot is cleared. This would be inconsequential if it weren't for another bug: configs with split horizons are erroneously judged to differ when they do not. As a result, a config change brought about by an election, which is a force config on 4.4, can clear the currentCommittedSnapshot. If no majority write occurs after that point (e.g. because the other nodes were shut down), the FCV can never be read. Unfortunately, Cloud Backup has a procedure that commonly triggers this.

      We can fix this by clearing the lastFCVUpdateSnapshot when we call dropAllSnapshots (4.4) or clearCommittedSnapshot (5.0) in ReplicationCoordinatorExternalStateImpl.
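
      The failure mode and the proposed fix can be illustrated with a toy model. This is a sketch only, not server code: the class ToyReplCoordinator, its fields and methods, and the integer timestamps are hypothetical names invented for illustration, and the blocking rule is an assumed simplification of the waitForMajority behaviour described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyReplCoordinator:
    """Hypothetical stand-in for the replication state; not MongoDB code."""

    fix_enabled: bool
    # Timestamp of the committed snapshot, or None once it has been cleared.
    current_committed_snapshot: Optional[int] = None
    # Timestamp of the last FCV write that readers must wait for.
    last_fcv_update_snapshot: Optional[int] = None

    def set_fcv(self, ts: int) -> None:
        # An FCV change is written at timestamp ts.
        self.last_fcv_update_snapshot = ts

    def advance_committed_snapshot(self, ts: int) -> None:
        # A majority-committed write moves the committed snapshot forward.
        self.current_committed_snapshot = ts

    def drop_committed_snapshot(self) -> None:
        # Models dropAllSnapshots (4.4) / clearCommittedSnapshot (5.0).
        self.current_committed_snapshot = None
        if self.fix_enabled:
            # Proposed fix: also forget the FCV snapshot being waited on.
            self.last_fcv_update_snapshot = None

    def fcv_read_would_block(self) -> bool:
        # Assumed rule: the read waits until the committed snapshot
        # covers the last FCV write.
        if self.last_fcv_update_snapshot is None:
            return False
        if self.current_committed_snapshot is None:
            return True
        return self.current_committed_snapshot < self.last_fcv_update_snapshot


def scenario(fix_enabled: bool) -> bool:
    """Return True if reading the FCV would hang at the end of the scenario."""
    node = ToyReplCoordinator(fix_enabled=fix_enabled)
    node.set_fcv(ts=10)
    node.advance_committed_snapshot(ts=10)  # FCV change is majority-committed
    node.drop_committed_snapshot()          # e.g. force config after an election
    # No further majority writes happen: the other nodes are shut down.
    return node.fcv_read_would_block()
```

      In this model, scenario(False) reports that the FCV read would block indefinitely, while scenario(True) reports that it would return, which mirrors the reasoning in the description.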

            Assignee: Matthew Russotto (matthew.russotto@mongodb.com)
            Reporter: Matthew Russotto (matthew.russotto@mongodb.com)
            Votes: 0
            Watchers: 8
