  Core Server / SERVER-37846

writeConcern can be satisfied with an arbiter if the write was committed


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.4.17, 3.6.8, 4.0.3, 4.1.4
    • Fix Version/s: 3.6.15, 4.0.7, 4.1.8, 3.4.24
    • Component/s: Replication
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.0, v3.6, v3.4
    • Steps To Reproduce:

      (function() {
          "use strict";
       
          var rs = new ReplSetTest({name: "reproTest", nodes: 4, waitForKeys: true});
          rs.startSet();
          var nodes = rs.nodeList();
          rs.initiate({
              "_id": "reproTest",
              "members": [
                  {"_id": 0, "host": nodes[0]},
                  {"_id": 1, "host": nodes[1]},
                  {"_id": 2, "host": nodes[2], priority: 0, votes: 0},
                  {"_id": 3, "host": nodes[3], "arbiterOnly": true}
              ]
          });
          var primary = rs.getPrimary();
          var db = primary.getDB('foo');

          // All 4 nodes up: the 3 data-bearing nodes can satisfy w:3.
          assert.commandWorked(db.coll.insert({a: 1}, {writeConcern: {w: 3, wtimeout: 10000}}));
       
          jsTestLog("first insert worked with all nodes up");
       
          rs.stop(2);  // shut down the priority-0, 0-vote secondary
       
          jsTestLog("node shut down");
       
          printjson(rs.status());
       
          jsTestLog("About to do write");
       
          // Only 2 data-bearing nodes remain, so w:3 should time out.
          // On affected versions this insert incorrectly succeeds because
          // the arbiter is counted toward the write concern.
          assert.commandFailedWithCode(db.coll.insert({a: 2}, {writeConcern: {w: 3, wtimeout: 10000}}),
                                      ErrorCodes.WriteConcernFailed);
       
          rs.stopSet();
       
      })();
      

    • Sprint:
      Repl 2018-12-17, Repl 2019-01-14, Repl 2019-01-28
    • Case:

      Description

      There is an issue when using a PSSA architecture where one node is hidden with 0 votes and 0 priority. It occurs when the node with 0 votes goes down for some reason and the following write is issued:

      db.test.insert({a:1},{writeConcern: {w: 3, wtimeout: 10000}}) 
      

      This is expected to fail because there are not enough data-bearing nodes to satisfy the writeConcern.

      The write actually succeeds though:

      WriteResult({ "nInserted" : 1 })
      

      In this architecture, only two nodes need to receive the write for it to be considered replicated to a majority of the set, because only voting nodes count toward the majority. Once the primary and the voting secondary have applied the write, it is committed, and the arbiter is sent the new lastCommittedOpTime. To determine whether the writeConcern is satisfied, the topology coordinator then checks every node in the replica set to see how many have replicated the write. That check includes the arbiter, which reports its lastAppliedOpTime as the lastCommittedOpTime it was just sent. So even though the write was replicated to only 2 nodes, the topology coordinator counts 3 and reports the writeConcern as satisfied.
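
      The over-counting described above can be sketched in plain JavaScript (runnable with Node, and deliberately not MongoDB source: the member list, field names, and both check functions are illustrative assumptions). It shows how including the arbiter makes w:3 look satisfied with only 2 data-bearing copies, and how one way to avoid that, skipping arbiters when counting, restores the expected timeout:

```javascript
// Hypothetical model of the write-concern check; not MongoDB source code.
// Replica set state after the w:3 write, with the non-voting secondary down.
// The write is majority-committed once the two voting data-bearing nodes have
// it, and the arbiter then advances its optime to the new lastCommittedOpTime.
const writeOpTime = 10;
const members = [
  { host: "primary",   arbiterOnly: false, lastAppliedOpTime: 10 },
  { host: "secondary", arbiterOnly: false, lastAppliedOpTime: 10 },
  { host: "nonvoter",  arbiterOnly: false, lastAppliedOpTime: 5 },  // down, behind
  { host: "arbiter",   arbiterOnly: true,  lastAppliedOpTime: 10 }, // echoes lastCommittedOpTime
];

// Buggy shape of the check: counts every member whose optime reached the write.
function buggyIsSatisfied(w, opTime) {
  return members.filter(m => m.lastAppliedOpTime >= opTime).length >= w;
}

// Arbiters bear no data, so excluding them leaves w:3 unsatisfied, as expected.
function fixedIsSatisfied(w, opTime) {
  return members.filter(m => !m.arbiterOnly && m.lastAppliedOpTime >= opTime).length >= w;
}

console.log(buggyIsSatisfied(3, writeOpTime)); // true  -- arbiter counted as a third "copy"
console.log(fixedIsSatisfied(3, writeOpTime)); // false -- only 2 data-bearing copies exist
```

      With only 2 data-bearing nodes holding the write, the buggy count reports w:3 satisfied, matching the `WriteResult({ "nInserted" : 1 })` seen in the description.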

    Attachments

    Issue Links

    Activity

    People

    • Votes: 3
    • Watchers: 22

    Dates

    • Created:
    • Updated:
    • Resolved: