[SERVER-37846] writeConcern can be satisfied with an arbiter if the write was committed
Created: 31/Oct/18  Updated: 29/Oct/23  Resolved: 24/Jan/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.4.17, 3.6.8, 4.0.3, 4.1.4
Fix Version/s: 3.6.15, 4.0.7, 4.1.8, 3.4.24

Type: Bug
Priority: Major - P3
Reporter: Samyukta Lanka
Assignee: Vesselina Ratcheva (Inactive)
Resolution: Fixed
Votes: 3
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File repro.js    
Issue Links:
Backports
Problem/Incident
causes SERVER-40355 rs.config that contains an _id greate... Closed
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.0, v3.6, v3.4
Steps To Reproduce:

(function() {
    "use strict";
 
    var rs = new ReplSetTest({name: "reproTest", nodes: 4, waitForKeys: true});
    rs.startSet();
    var nodes = rs.nodeList();
    rs.initiate({
        "_id": "reproTest",
        "members": [
            {"_id": 0, "host": nodes[0]},
            {"_id": 1, "host": nodes[1]},
            {"_id": 2, "host": nodes[2], priority: 0, votes: 0},
            {"_id": 3, "host": nodes[3], "arbiterOnly": true}
        ]
    });
    var primary = rs.getPrimary();
    var db = primary.getDB('foo');
    var coll = db.getCollection('bar');

    // With all four nodes up, three data-bearing nodes can acknowledge w: 3.
    assert.commandWorked(coll.insert({a: 1}, {writeConcern: {w: 3, wtimeout: 10000}}));
 
    jsTestLog("first insert worked with all nodes up");
 
    // Stop the non-voting secondary, leaving only two data-bearing nodes.
    rs.stop(2);
 
    jsTestLog("node shut down");
 
    printjson(rs.status());
 
    jsTestLog("About to do write");
 
    // Expected to fail with WriteConcernFailed; on affected versions the write succeeds.
    assert.commandFailedWithCode(coll.insert({a: 2}, {writeConcern: {w: 3, wtimeout: 10000}}),
                                 ErrorCodes.WriteConcernFailed);
 
    rs.stopSet();
 
})();

Sprint: Repl 2018-12-17, Repl 2019-01-14, Repl 2019-01-28

 Description   

A numeric writeConcern can be satisfied incorrectly in a PSSA architecture where one node is hidden with 0 votes and 0 priority. The problem occurs when the node with 0 votes goes down and the following write is issued:

db.test.insert({a:1},{writeConcern: {w: 3, wtimeout: 10000}}) 

This write is expected to fail because there are not enough data-bearing nodes up to satisfy the writeConcern.

However, the write succeeds:

WriteResult({ "nInserted" : 1 })

In this architecture, only two nodes need to receive the write for it to be considered replicated to a majority of nodes, because only nodes with a vote count toward the majority. Once both the primary and the voting secondary apply the write, it is committed and the arbiter is sent the new lastCommittedOpTime. To determine whether the writeConcern is satisfied, the topology coordinator looks at every node in the replica set to see whether enough of them have replicated the write. This check includes the arbiter, which reports its lastAppliedOpTime as the lastCommittedOpTime it was just sent. So even though the write was replicated to only 2 nodes, the topology coordinator counts 3 nodes and reports the writeConcern as satisfied.
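
The counting described above can be illustrated with a small standalone model. This is only a sketch in plain JavaScript, not server code: the member objects, the integer optimes, and the helper names (majorityOfVoters, countReportingOpTime, countDataBearingWithOpTime) are all hypothetical and exist only to show why the arbiter inflates the count.

```javascript
// Hypothetical model of the replica set state after the second insert.
// Optimes are simplified to integers: 1 = first insert, 2 = second insert.
const members = [
    {name: "primary",   votes: 1, arbiterOnly: false, lastAppliedOpTime: 2},
    {name: "secondary", votes: 1, arbiterOnly: false, lastAppliedOpTime: 2},
    // The non-voting node is down; its last known progress is the first insert.
    {name: "nonVoting", votes: 0, arbiterOnly: false, lastAppliedOpTime: 1},
    // The arbiter holds no data but reports the lastCommittedOpTime it was sent.
    {name: "arbiter",   votes: 1, arbiterOnly: true,  lastAppliedOpTime: 2},
];

// Only voting members count toward the majority.
function majorityOfVoters(members) {
    const voters = members.filter(m => m.votes > 0).length;
    return Math.floor(voters / 2) + 1;
}

// Count every member that reports the optime, arbiters included.
function countReportingOpTime(members, opTime) {
    return members.filter(m => m.lastAppliedOpTime >= opTime).length;
}

// Count only data-bearing members that report the optime.
function countDataBearingWithOpTime(members, opTime) {
    return members.filter(m => !m.arbiterOnly && m.lastAppliedOpTime >= opTime).length;
}

console.log(majorityOfVoters(members));              // 2 of 3 voters commit the write
console.log(countReportingOpTime(members, 2));       // 3, so w: 3 looks satisfied
console.log(countDataBearingWithOpTime(members, 2)); // 2, the nodes that hold the write
```

In this model the write commits once 2 voters have it, and the arbiter's reported optime is what pushes the count from 2 to 3.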



 Comments   
Comment by Githook User [ 24/Sep/19 ]

Author:

{'username': 'vessy-mongodb', 'email': 'vesselina.ratcheva@mongodb.com', 'name': 'Vesselina Ratcheva'}

Message: SERVER-37846 Disallow using arbiters to satisfy numeric write concern when writes commit
SERVER-40355 Handle RS config with _id larger than set size

(cherry picked from commit b023cfd4db379092f7642dd825d79652d905f847)
(cherry picked from commit 109129eb5f46419e852b65eb35f935194d17fd5d)
Branch: v3.6
https://github.com/mongodb/mongo/commit/82a4c718fb257d28bdf2aed92ad3d94de9b52f6f

Comment by Githook User [ 24/Sep/19 ]

Author:

{'name': 'Vesselina Ratcheva', 'username': 'vessy-mongodb', 'email': 'vesselina.ratcheva@mongodb.com'}

Message: SERVER-37846 Disallow using arbiters to satisfy numeric write concern when writes commit
SERVER-40355 Handle RS config with _id larger than set size

(cherry picked from commit b023cfd4db379092f7642dd825d79652d905f847)
(cherry picked from commit 109129eb5f46419e852b65eb35f935194d17fd5d)
Branch: v3.4
https://github.com/mongodb/mongo/commit/c6cef9e05219b02e7db1696617b4a44f0c6bed67

Comment by Githook User [ 27/Feb/19 ]

Author:

{'name': 'Vesselina Ratcheva', 'username': 'vessy-mongodb', 'email': 'vesselina.ratcheva@10gen.com'}

Message: SERVER-37846 Disallow using arbiters to satisfy numeric write concern when writes commit

(cherry picked from commit b023cfd4db379092f7642dd825d79652d905f847)
Branch: v4.0
https://github.com/mongodb/mongo/commit/5b6ae4ca09c36175186c7c0028758b9d9cdfc93e

Comment by Githook User [ 24/Jan/19 ]

Author:

{'username': 'vessy-mongodb', 'email': 'vesselina.ratcheva@10gen.com', 'name': 'Vesselina Ratcheva'}

Message: SERVER-37846 Disallow using arbiters to satisfy numeric write concern when writes commit
Branch: master
https://github.com/mongodb/mongo/commit/b023cfd4db379092f7642dd825d79652d905f847

Comment by Samyukta Lanka [ 31/Oct/18 ]

One potential solution is to ignore arbiters when determining whether enough nodes have replicated the write.
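
A minimal sketch of that idea, written as plain JavaScript rather than server code. The function name isNumericWriteConcernSatisfied, the ignoreArbiters option, the member objects, and the integer optimes are assumptions made for illustration; the actual change is in the commits linked above.

```javascript
// Hypothetical check: does a numeric write concern `w` hold for `opTime`?
// With ignoreArbiters set, members that cannot hold data are skipped.
function isNumericWriteConcernSatisfied(members, opTime, w, options) {
    let count = 0;
    for (const m of members) {
        if (options.ignoreArbiters && m.arbiterOnly) {
            continue; // an arbiter never replicates the write
        }
        if (m.lastAppliedOpTime >= opTime) {
            count++;
        }
    }
    return count >= w;
}

// State after the second insert, with the non-voting node down (optime 1)
// and the arbiter reporting the committed optime (2).
const members = [
    {name: "primary",   arbiterOnly: false, lastAppliedOpTime: 2},
    {name: "secondary", arbiterOnly: false, lastAppliedOpTime: 2},
    {name: "nonVoting", arbiterOnly: false, lastAppliedOpTime: 1},
    {name: "arbiter",   arbiterOnly: true,  lastAppliedOpTime: 2},
];

const countingArbiters = isNumericWriteConcernSatisfied(members, 2, 3, {ignoreArbiters: false});
const skippingArbiters = isNumericWriteConcernSatisfied(members, 2, 3, {ignoreArbiters: true});
const w2SkippingArbiters = isNumericWriteConcernSatisfied(members, 2, 2, {ignoreArbiters: true});

console.log(countingArbiters);   // true: the reported behavior
console.log(skippingArbiters);   // false: w: 3 waits until wtimeout
console.log(w2SkippingArbiters); // true: w: 2 is still satisfiable
```

Skipping arbiters only changes the numeric count in this sketch; a write with w: 2 is still acknowledged by the two data-bearing nodes that are up.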
