  1. Core Server
  2. SERVER-50735

Mongos 4.4.0 can return the topologyVersion of a shard in state change errors



    • Bug
    • Status: Backlog
    • Major - P3
    • Resolution: Unresolved
    • 4.4.0
    • None
    • None
    • Service Arch
    • ALL
    • Repl 2020-10-05


      Mongos can return the wrong topologyVersion in state change error responses to the client. Instead of returning its own topologyVersion, mongos can return the topologyVersion of a shard member in some error responses.

      I've attached a reproduction script that uses failCommand (reproMongosWrongTopologyVersion.js). The same bug can also be triggered with a real shutdown or replSetStepDown event. To reproduce:

      1. Start a cluster. Mine has 1 mongos and a 1-member shard.
      2. Note the topologyVersion reported by the mongos.
      3. Insert into test.test.
      4. Run an operation on mongos. My example uses findAndModify, but I imagine other commands will also reproduce.
      5. At the same time, cause the mongos-to-mongod operation to fail. This can be done with failCommand, shutdown:1, etc.
      6. Observe that the operation fails and mongos returns an error with the incorrect topologyVersion.
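
      A minimal sketch of how step 5 might be set up with failCommand. This is not the attached script. The helper name buildFailCommand is hypothetical, and the sketch assumes the failpoint is configured directly on the shard's mongod, with failInternalCommands set so that commands arriving from mongos are matched:

```javascript
// Hypothetical helper (not from the attached repro script): builds a
// configureFailPoint command document for the failCommand failpoint.
// Assumption: the document is sent to the shard's mongod, not to mongos.
function buildFailCommand(commandName, errorCode) {
  return {
    configureFailPoint: "failCommand",
    mode: { times: 1 },            // fail only the next matching command
    data: {
      failCommands: [commandName], // command names the failpoint applies to
      errorCode: errorCode,        // error code returned to the caller
      // Assumption: required so that commands sent by mongos over an
      // internal connection are also failed.
      failInternalCommands: true
    }
  };
}

// 11602 is InterruptedDueToReplStateChange, the code seen in this report.
const failFindAndModify = buildFailCommand("findAndModify", 11602);
console.log(JSON.stringify(failFindAndModify, null, 2));
```

      In the mongo shell, connected to the shard primary, the document would be passed to db.adminCommand(...) before running the operation from step 4 through mongos.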

      The attached repro script's output (notice the two different topologyVersion values, both reported by mongos):

      mongos topologyVersion:  {
      	"processId" : ObjectId("5f501dd34f1464b2cf98116a"),
      	"counter" : NumberLong(0)
      }
      uncaught exception: Error: findAndModifyFailed failed: {
      	"topologyVersion" : {
      		"processId" : ObjectId("5f501dcf33e67a0de9b4ab21"),
      		"counter" : NumberLong(8)
      	},
      	"ok" : 0,
      	"errmsg" : "Failing command due to 'failCommand' failpoint",
      	"code" : 11602,
      	"codeName" : "InterruptedDueToReplStateChange",
      	"operationTime" : Timestamp(1599086037, 28),
      	"$clusterTime" : {
      		"clusterTime" : Timestamp(1599086037, 28),
      		"signature" : {
      			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      			"keyId" : NumberLong(0)
      		}
      	}
      } :
      failed to load: reproMongosWrongTopologyVersion.js
      exiting with code -3
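
      The mismatch in the output above can be checked mechanically. The following is a rough sketch, not part of the attached script. The helper name compareTopologyVersions is hypothetical, and processId values are written as plain hex strings (an assumption, so the snippet runs outside the mongo shell). It relies on the rule that two topologyVersions are only comparable by counter when their processId values match:

```javascript
// Hypothetical helper: classifies an incoming topologyVersion relative to
// the one currently known for a server. processId is a plain hex string
// here (assumption), standing in for the ObjectId seen in shell output.
function compareTopologyVersions(current, incoming) {
  // Different processId means the values came from different processes
  // (or from a restart), so the counters cannot be compared.
  if (current.processId !== incoming.processId) {
    return "different-process";
  }
  if (incoming.counter > current.counter) return "newer";
  if (incoming.counter < current.counter) return "older";
  return "same";
}

// Values copied from the repro output in this report.
const mongosTopologyVersion = { processId: "5f501dd34f1464b2cf98116a", counter: 0 };
const errorTopologyVersion = { processId: "5f501dcf33e67a0de9b4ab21", counter: 8 };

console.log(compareTopologyVersions(mongosTopologyVersion, errorTopologyVersion));
// prints "different-process"
```

      A client connected only to mongos would expect every response from it to carry the mongos processId, so a "different-process" result here is consistent with the error carrying the shard member's topologyVersion.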

      It's possible this could be fixed by SERVER-50549 but I wanted to call out this bug separately.


              backlog-server-servicearch Backlog - Service Architecture
              shane.harvey@mongodb.com Shane Harvey