[SERVER-54746] Two primaries in a replica set can satisfy write concern "majority". Created: 24/Feb/21  Updated: 27/Oct/23  Resolved: 01/Mar/21

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Suganthi Mani Assignee: Backlog - Replication Team
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-47852 Two primaries can satisfy write conce... Closed
related to SERVER-47363 Define correct use of MemberIds in At... Closed
Assigned Teams:
Replication
Operating System: ALL
Steps To Reproduce:

/*
 * This test demonstrates that two primaries can exist in the same replica set at the
 * same time and that both can satisfy majority write concern.
 *
 * The test simulates the scenario below.
 * Note: 'P' refers to a primary, 'S' to a secondary.
 * 1) [P, S0, S1] // Start a 3-node replica set.
 * 2) Partition A: [P] Partition B: [S0->P, S1] // Create network partitions A & B.
 * 3) Partition A: [P, S2] Partition B: [P, S1] // Add a new node S2 to Partition A, reusing S1's member id.
 */
load('jstests/replsets/rslib.js');
(function() {
'use strict';
 
// Start a 3 node replica set.
// [P, S0, S1]
const rst = new ReplSetTest({
    nodes: [{}, {}, {rsConfig: {priority: 0}}],
    useBridge: true
});
 
// Disable chaining and prevent automatic elections triggered by liveness timeouts.
var config = rst.getReplSetConfig();
config.settings = config.settings || {};
config.settings["chainingAllowed"] = false;
config.settings["electionTimeoutMillis"] = ReplSetTest.kForeverMillis;
 
rst.startSet();
rst.initiate(config);
 
const dbName = jsTest.name();
const collName = "coll";
 
let primary1 = rst.getPrimary();
const primaryDB = primary1.getDB(dbName);
const primaryColl = primaryDB[collName];
const secondaries = rst.getSecondaries();
 
jsTestLog("Do a document write");
assert.commandWorked(primaryColl.insert({_id: 1, x: 1}, {"writeConcern": {"w": 3}}));
rst.awaitReplication();
 
// Create a network partition, resulting in this state: [P] [S0, S1].
jsTestLog("Disconnect primary1 from all secondaries");
primary1.disconnect([secondaries[0], secondaries[1]]);
 
jsTestLog("Make secondary0 to be become primary");
assert.commandWorked(secondaries[0].adminCommand({"replSetStepUp": 1}));
 
// Now our network topology will be [P] [S0->P, S1].
jsTestLog("Wait for secondary0 to become master");
checkLog.contains(secondaries[0], "Transition to primary complete");
let primary2 = secondaries[0];
 
 
// A majority write on primary2 succeeds because it replicates to S1, so two of the
// three voting nodes acknowledge it.
jsTestLog("Do a document write on primary2");
assert.commandWorked(
    primary2.getDB(dbName)[collName].insert({_id: 2, x: 2}, {"writeConcern": {"w": "majority"}}));
 
 
jsTestLog("Adding a new voting node to the replica set");
let origConfig = rst.getReplSetConfigFromNode(primary1.nodeId);
const node4 = rst.add({
    rsConfig: {priority: 0, votes: 1},
    setParameter: {
        'numInitialSyncAttempts': 1
    }
});
 
// Disconnect node4 from Partition B to simulate the network topology [P, S2] [P, S1].
node4.disconnect([secondaries[0], secondaries[1]]);
 
// Run a reconfig command on primary1 that points S1's member id at node4's host,
// effectively replacing S1 with node4 in primary1's config.
var newConfig = rst.getReplSetConfig();
origConfig.members[2].host = newConfig.members[3].host;
origConfig.version += 1;
assert.adminCommandWorkedAllowingNetworkError(
    primary1, {replSetReconfig: origConfig, maxTimeMS: ReplSetTest.kDefaultTimeoutMS});
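// Note: This reconfig is accepted even though member id 2 now points at a different
// host. As the Description below explains, the compatibility check only requires
// matching member ids for matching hosts, not matching hosts for matching ids.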
 
jsTestLog(
    "Do some document writes to verify we have 2 primaries and both satisfy write concern majority");
assert.commandWorked(primary1.getDB(dbName)[collName].insert({_id: 3, x: "primary1 Doc"},
                                                             {"writeConcern": {"w": "majority"}}));
assert.commandWorked(primary2.getDB(dbName)[collName].insert({_id: 6, x: "primary2 Doc"},
                                                             {"writeConcern": {"w": "majority"}}));
 
jsTestLog("Verify our primary1 can be get re-elected.");
assert.commandWorked(primary1.adminCommand({"replSetStepDown": 1000, "force": true}));
assert.commandWorked(primary1.adminCommand({replSetFreeze: 0}));
assert.commandWorked(primary1.adminCommand({"replSetStepUp": 1}));
 
jsTestLog("Test completed");
rst.stopSet();
}());


Description

This is just a side-effect of allowing a replica set member to be replaced by a new node (with a different host address) that reuses the replaced node's member id.

Code snippet from validateOldAndNewConfigsCompatible() (which is called during the reconfig procedure):

    // For every member config mNew in newConfig, if there exists member config mOld
    // in oldConfig such that mNew.getHostAndPort() == mOld.getHostAndPort(), it is required
    // that mNew.getId() == mOld.getId().

In other words, the check requires matching member ids when host names match, but when member ids are equal it does not require that the host names match.
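
As a minimal sketch of that rule (hypothetical test-style JavaScript, not the actual C++ server code), the check passes when an existing member id moves to a brand-new host:

    // Hypothetical sketch of the host/id compatibility rule quoted above.
    function checkHostIdCompatibility(oldMembers, newMembers) {
        for (const mNew of newMembers) {
            for (const mOld of oldMembers) {
                // The same host:port must keep the same member id...
                if (mNew.host === mOld.host && mNew._id !== mOld._id) {
                    return false;
                }
                // ...but the converse is never enforced: an existing member id
                // is free to move to a completely different host.
            }
        }
        return true;
    }

    // S1 (_id: 2) is replaced by S2 on a different host; the check passes.
    assert(checkHostIdCompatibility([{_id: 2, host: "s1.example.net:27017"}],
                                    [{_id: 2, host: "s2.example.net:27017"}]));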



Comments
Comment by Steven Vannelli [ 01/Mar/21 ]

We are going to close this as Works as Designed since this behavior/expectation will be documented in DOCS-13745.

Comment by Lingzhi Deng [ 01/Mar/21 ]

I think we allow this because we need to support changing hostnames in the config. In the case posted above, I don't think the server can tell if S1 was moved to a different host or if S2 was indeed a new node. So my understanding is that this is more like a misuse of reconfig and member id.
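
For reference, a supported hostname change looks roughly like this (hostnames are illustrative); from the server's point of view it is indistinguishable from the reconfig in the repro above:

    // Legitimate use of the same rule: member _id 2 moved to new hardware, so the
    // config is updated to point the existing member id at its new address.
    let cfg = rs.conf();
    cfg.members[2].host = "new-host.example.net:27017";  // same _id, new host
    cfg.version += 1;
    rs.reconfig(cfg);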

Comment by Suganthi Mani [ 24/Feb/21 ]

CC lingzhi.deng. Not sure whether this is a valid case, but I thought of bringing it to the replication team's attention.
