[SERVER-31740] Setting FCV 3.4 on a 3.6 primary kills 3.4 secondaries Created: 26/Oct/17  Updated: 30/Oct/23  Resolved: 10/Nov/17

Status: Closed
Project: Core Server
Component/s: Replication, Storage
Affects Version/s: None
Fix Version/s: 3.6.0-rc4

Type: Bug Priority: Major - P3
Reporter: Tess Avitabile (Inactive) Assignee: Xiangyu Yao (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

(function() {
    // Start a 1-node replica set with a 3.6 node.
    let rst = new ReplSetTest({nodes: [{binVersion: "latest"}]});
    rst.startSet();
 
    // Change catchUpTimeoutMillis to a value 3.4 will understand.
    let replSetConfig = rst.getReplSetConfig();
    replSetConfig.settings = {catchUpTimeoutMillis: 2000};
    rst.initiate(replSetConfig);
 
    // Set FCV to 3.4 so that a 3.4 node can join the set.
    let primary = rst.getPrimary();
    assert.commandWorked(primary.adminCommand({setFeatureCompatibilityVersion: "3.4"}));
 
    // Add a 3.4 node to the set.
    let secondary = rst.add({binVersion: "3.4"});
    rst.reInitiate();
 
    // Ensure the 3.4 node completed its initial sync.
    assert.writeOK(primary.getDB("test").coll.insert({awaitRepl: true}, {writeConcern: {w: 2}}));
 
    // Run {setFeatureCompatibilityVersion: "3.4"} again. This should be idempotent.
    assert.adminCommandWorkedAllowingNetworkError(primary, {setFeatureCompatibilityVersion: "3.4"});
 
    // The 3.4 node crashes due to replicating the targetVersion.
    assert.soon(function() {
        try {
            secondary.adminCommand({ping: 1});
        } catch (e) {
            return true;
        }
        return false;
    });
    rst.stop(secondary, undefined, {allowedExitCode: MongoRunner.EXIT_ABRUPT});
    rst.stopSet();
})();

Sprint: Storage 2017-11-13
Participants:

 Description   

Running {setFeatureCompatibilityVersion: "3.4"} on a 3.6 primary replicates the targetVersion to 3.4 secondaries, causing them to crash. This should not happen if the featureCompatibilityVersion is already 3.4.



 Comments   
Comment by Githook User [ 10/Nov/17 ]

Author: Xiangyu Yao (xy24) <xiangyu.yao@mongodb.com>

Message: SERVER-31740 Make setFeatureCompatibilityVersion idempotent
Branch: master
https://github.com/mongodb/mongo/commit/07c34da05d049282a41e84282d6755149127e4b7

Comment by Eric Milkie [ 03/Nov/17 ]

Note that due to SERVER-31633, downgrading to FCV 3.4 is now blocked in this repro, since a majority of nodes need to be 3.6 mongods for the downgrade to succeed.
To fix this, add another 3.6 node to the replica set in the repro.
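
To see why the extra node helps, the node counts can be worked through in a small sketch. It assumes "majority" means strictly more than half of the members in the config; the actual check added by SERVER-31633 is not described in this ticket, and hasLatestMajority is a hypothetical helper, not a shell or server function.

```javascript
// Hypothetical helper (an assumption for illustration): returns true when
// strictly more than half of the listed nodes run the "latest" (3.6) binary.
function hasLatestMajority(nodes) {
    const latest = nodes.filter(n => n.binVersion === "latest").length;
    return latest > nodes.length / 2;
}

// Repro as written: one 3.6 node plus the 3.4 node added later.
const originalRepro = [{binVersion: "latest"}, {binVersion: "3.4"}];

// Adjusted repro: a second 3.6 node is part of the set.
const adjustedRepro = [{binVersion: "latest"}, {binVersion: "latest"}, {binVersion: "3.4"}];

const originalOk = hasLatestMajority(originalRepro);
const adjustedOk = hasLatestMajority(adjustedRepro);
```

Under that assumption, one 3.6 node out of two is not a majority, while two out of three is, so the second setFeatureCompatibilityVersion call in the repro would only get past the check with the extra 3.6 node present.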

Comment by Maria van Keulen [ 31/Oct/17 ]

geert.bosch and I discussed Tess's design and agree that it makes sense; I will proceed with it.

Comment by Eric Milkie [ 27/Oct/17 ]

Maria should be able to get to this next week.
geert.bosch do you concur with Tess's suggested design?

Comment by Tess Avitabile (Inactive) [ 27/Oct/17 ]

After discussion with judah.schvimer, marking this as 3.6 Required and proposing a fix: When the FCV is already equal to the desired FCV, setFeatureCompatibilityVersion should do nothing and return success.

This would not have worked before targetVersion was introduced: on downgrade, we set the FCV to 3.4 and then removed UUIDs, so {setFeatureCompatibilityVersion: "3.4"} needed to strip UUIDs even if the FCV was already 3.4. Now that we are using targetVersion, FCV=3.4 means that all downgrade steps are complete, so it is safe to have {setFeatureCompatibilityVersion: "3.4"} do nothing.
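
The proposed behavior can be sketched with a small in-memory model. This is an illustration only, not the server's implementation: makeNode, setFCV, the shape of fcvDoc, and the oplog array are hypothetical stand-ins for the FCV document, the command, and its replicated writes, and the schema-change steps are elided.

```javascript
// Hypothetical model (all names and the document shape are assumptions).
// fcvDoc.targetVersion is present only while a transition is in progress.
function makeNode(version) {
    return {fcvDoc: {version: version}, oplog: []};
}

function setFCV(node, requested) {
    const doc = node.fcvDoc;

    // Proposed fix: the FCV already equals the requested version and no
    // transition is in progress, so do nothing and return success.
    if (doc.version === requested && doc.targetVersion === undefined) {
        return {ok: 1};
    }

    // Otherwise record the transition target. This is the write that a
    // 3.4 secondary cannot handle when it is replicated.
    doc.targetVersion = requested;
    node.oplog.push({set: {targetVersion: requested}});

    // (Schema changes such as adding or removing UUIDs would happen here.)

    // Finish the transition: set the version and clear targetVersion.
    doc.version = requested;
    delete doc.targetVersion;
    node.oplog.push({set: {version: requested}, unset: ["targetVersion"]});
    return {ok: 1};
}

const node = makeNode("3.4");
const noopResult = setFCV(node, "3.4");
const writesAfterNoop = node.oplog.length;

setFCV(node, "3.6");
const writesAfterUpgrade = node.oplog.length;
```

In this model the repeated {setFeatureCompatibilityVersion: "3.4"} call produces no writes at all, so nothing containing targetVersion reaches a secondary, while a real version change still goes through both writes.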

If the Storage team does not have the bandwidth for this ticket, Query or Replication may be able to take it.

Comment by Tess Avitabile (Inactive) [ 27/Oct/17 ]

In general, we do not have test coverage of idempotency for setFeatureCompatibilityVersion. It seemed obvious that the command was idempotent in 3.4, but it makes sense to test it now that the command performs schema changes and the targetVersion write. I think esha.maharishi has a spreadsheet of legal setFeatureCompatibilityVersion commands for different configurations that should be tested, and this may include idempotency testing.

Comment by Judah Schvimer [ 27/Oct/17 ]

I'm surprised we don't have test coverage for this case. Are there any similar cases we don't have coverage of?

Comment by Tess Avitabile (Inactive) [ 26/Oct/17 ]

geert.bosch I think this is a critical bug introduced by the addition of targetVersion.

Generated at Thu Feb 08 04:28:03 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.