  1. Core Server
  2. SERVER-57772

Failpoints on mongos rewrite state change error codes in writeConcernError

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 5.0.6, 5.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v5.0, v4.4, v4.2, v4.0
    • Steps To Reproduce:

      Start a sharded cluster running version 5.0.0-alpha0-856-gf4e7955.

      Using the shell, configure a failpoint on the "insert" command using an error code that represents a server state change.

      var code = 91; // ShutdownInProgress
      var cmd = {
          configureFailPoint: "failCommand",
          mode: {times: 1},
          data: {
              failCommands: ["insert"],
              writeConcernError: {code: code, errmsg: "Replication is being shut down"}
          }
      };
      db.adminCommand(cmd);
      db.runCommand({insert: "coll", documents: [{x:1}]});
      

      Results in:

      {
              "n" : 1,
              "writeConcernError" : {
                      "code" : 6,
                      "errmsg" : "Replication is being shut down"
              },
              "ok" : 1,
              "$clusterTime" : {
                      "clusterTime" : Timestamp(1623887155, 1),
                      "signature" : {
                              "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                              "keyId" : NumberLong(0)
                      }
              },
              "operationTime" : Timestamp(1623887155, 1)
      }
      

      The error code 91 was rewritten to 6 (HostUnreachable).

    • Sprint: Service Arch 2021-06-28, Service Arch 2021-07-12

      When a failpoint on mongos is configured to return a writeConcernError containing a state change error code, the code is rewritten to HostUnreachable (6). I think this is caused by the changes in SERVER-50549.

      State change errors indicate that the server has changed state (e.g. 91 = ShutdownInProgress or 10107 = NotWritablePrimary). Drivers document the state change errors they check for in the Server Discovery and Monitoring specification.
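
To make the distinction concrete, here is a minimal sketch of how a driver-side check might classify a code as a state change error. The code list is my reading of the Server Discovery and Monitoring specification and should be verified against the spec. The names STATE_CHANGE_CODES and isStateChangeError are hypothetical and not taken from any driver.

```javascript
// Assumption: codes and names as listed in the SDAM specification's
// "not writable primary" and "node is recovering" sections.
var STATE_CHANGE_CODES = new Map([
    [10107, "NotWritablePrimary"],
    [13435, "NotPrimaryNoSecondaryOk"],
    [10058, "LegacyNotPrimary"],
    [11602, "InterruptedDueToReplStateChange"],
    [13436, "NotPrimaryOrSecondary"],
    [189, "PrimarySteppedDown"],
    [91, "ShutdownInProgress"],
    [11600, "InterruptedAtShutdown"]
]);

// Hypothetical helper: true if the code is one a driver would treat
// as a server state change.
function isStateChangeError(code) {
    return STATE_CHANGE_CODES.has(code);
}

// The configured code (91) is a state change error; the rewritten
// code (6, HostUnreachable) is not in the list above.
var configuredIsStateChange = isStateChangeError(91);
var rewrittenIsStateChange = isStateChangeError(6);
```

Under this sketch, a driver that receives code 6 instead of 91 would not take its state change handling path, which is why the rewrite matters for tests.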

      Drivers expect failpoints on mongos to return the errors exactly as they are configured. This enables the test scenario of a mongos returning a state change error itself (instead of rewriting one from a backing mongod).
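
The expectation above could be expressed as a small check on the command reply. This is only a sketch. checkWriteConcernErrorCode is a hypothetical helper, and the reply object is the one from this ticket trimmed to the relevant fields.

```javascript
// Hypothetical helper: compares the writeConcernError code in a reply
// with the code that was configured on the failpoint.
function checkWriteConcernErrorCode(reply, expectedCode) {
    var wce = reply.writeConcernError;
    if (!wce) {
        // No writeConcernError in the reply at all.
        return {matches: false, actual: null, expected: expectedCode};
    }
    return {matches: wce.code === expectedCode, actual: wce.code, expected: expectedCode};
}

// Reply observed in this ticket, without $clusterTime and operationTime.
var reply = {
    n: 1,
    writeConcernError: {code: 6, errmsg: "Replication is being shut down"},
    ok: 1
};

// The failpoint was configured with code 91 (ShutdownInProgress).
var result = checkWriteConcernErrorCode(reply, 91);
```

With the reply from this ticket, result.matches is false because mongos returned 6 rather than the configured 91.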

      The writeConcernError case in particular affects only one test in the C driver and is easy to work around. It is not currently blocking driver tests.

            Assignee: Billy Donahue (billy.donahue@mongodb.com)
            Reporter: Kevin Albertson (kevin.albertson@mongodb.com)
            Votes: 0
            Watchers: 4

              Created:
              Updated:
              Resolved: