[SERVER-50735] Mongos 4.4.0 can return the topologyVersion of a shard in state change errors Created: 02/Sep/20  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: 4.4.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Shane Harvey Assignee: Backlog - Service Architecture
Resolution: Unresolved Votes: 0
Labels: sa-remove-fv-backlog-22
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File reproMongosWrongTopologyVersion.js    
Issue Links:
Related
is related to SERVER-50549 Transform connection-related error co... Closed
Assigned Teams:
Service Arch
Operating System: ALL
Sprint: Repl 2020-10-05
Participants:

 Description   

Mongos can return the wrong topologyVersion in state change error responses to the client. Instead of returning its own topologyVersion, mongos can return the topologyVersion of a shard member in some error responses.

I've attached a reproduction script using failCommand (reproMongosWrongTopologyVersion.js). It's also possible to trigger the same bug using a real shutdown or replSetStepDown event. To reproduce:

  1. Start a cluster. Mine has 1 mongos and a 1 member shard.
  2. Note the topologyVersion reported by the mongos.
  3. Insert into test.test
  4. Run an operation on mongos. My example uses findAndModify, but I imagine other commands will also reproduce.
  5. At the same time, cause the mongos-mongod operation to fail. This can be done with failCommand, shutdown:1, etc.
  6. See that the operation fails and mongos returns an error with the incorrect topologyVersion.

The attached repro script's output (notice the different topologyVersion fields, both reported by mongos):

mongos topologyVersion:  {
	"processId" : ObjectId("5f501dd34f1464b2cf98116a"),
	"counter" : NumberLong(0)
}
 
...
 
uncaught exception: Error: findAndModifyFailed failed: {
	"topologyVersion" : {
		"processId" : ObjectId("5f501dcf33e67a0de9b4ab21"),
		"counter" : NumberLong(8)
	},
	"ok" : 0,
	"errmsg" : "Failing command due to 'failCommand' failpoint",
	"code" : 11602,
	"codeName" : "InterruptedDueToReplStateChange",
	"operationTime" : Timestamp(1599086037, 28),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1599086037, 28),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
DBCollection.prototype.findAndModify@src/mongo/shell/collection.js:730:15
DBCollection.prototype.findOneAndReplace@src/mongo/shell/crud_api.js:833:12
@reproMongosWrongTopologyVersion.js:27:11
failed to load: reproMongosWrongTopologyVersion.js
exiting with code -3
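
The mismatch in the output above can be checked mechanically. The following is a minimal sketch in plain JavaScript, not part of the attached repro script: the helper name hasForeignTopologyVersion is hypothetical, ObjectIds are modelled as hex strings, and NumberLong counters are modelled as plain numbers. The values are copied from the output above.

```javascript
// Hypothetical helper (not from the repro script): returns true when an error
// response carries a topologyVersion whose processId differs from that of the
// node the command was sent to.
function hasForeignTopologyVersion(nodeTopologyVersion, errorResponse) {
  const errorTv = errorResponse.topologyVersion;
  if (errorTv === undefined || errorTv === null) {
    return false; // nothing attached, so there is nothing to mismatch
  }
  return errorTv.processId !== nodeTopologyVersion.processId;
}

// topologyVersion reported by mongos before the failure (from the output above).
const mongosTopologyVersion = { processId: "5f501dd34f1464b2cf98116a", counter: 0 };

// Relevant fields of the error that mongos returned for findAndModify.
const findAndModifyError = {
  topologyVersion: { processId: "5f501dcf33e67a0de9b4ab21", counter: 8 },
  ok: 0,
  code: 11602,
  codeName: "InterruptedDueToReplStateChange",
};

console.log(hasForeignTopologyVersion(mongosTopologyVersion, findAndModifyError)); // true
```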

It's possible this could be fixed by SERVER-50549 but I wanted to call out this bug separately.



 Comments   
Comment by Ratika Gandhi [ 09/Feb/21 ]

We will revisit the ticket after SERVER-50549 is complete. 

Comment by Tess Avitabile (Inactive) [ 22/Sep/20 ]

I agree it's a problem for mongos to forward state change errors from shards to drivers. However, I don't think appending the mongos's topologyVersion is the right way to solve this problem. It's an important aspect of the streamable hello protocol that a node only attach its topologyVersion for state change errors that change its topologyVersion. If mongos attaches topologyVersion to shutdown errors in 4.4, for example, then drivers will incorrectly ignore those shutdown errors, even though the mongos is shutting down. I would rather solve SERVER-50549 in a different way, by distinguishing errors from the shard vs the mongos itself, rather than using topologyVersion to say that errors should be ignored.

Comment by Divjot Arora (Inactive) [ 22/Sep/20 ]

I have some concerns about mongos never returning topologyVersion for command errors. In drivers, an omitted topologyVersion means that the response is never considered stale, so it will be processed. Because the errors propagated from mongods can be state change errors, this would cause drivers to mark the mongos Unknown due to a state change in one of the shards. This could potentially happen multiple times for the same state change if there are multiple concurrent operations running on the shard.

This concern will probably be alleviated by SERVER-50549, but I wanted to bring it up here as well because I don't understand why it's incorrect for a mongos to always add its topologyVersion to responses. My understanding is that topologyVersion is a way to communicate how new/old a server is to drivers so they can decide if an error or heartbeat response requires processing. If the mongos were to always append its own version, drivers would never mark it Unknown for state changes in shards because the version wouldn't be considered newer than the one the driver is already aware of.
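
To make the staleness reasoning above concrete, here is a minimal sketch of the comparison a driver performs, as I understand the Server Discovery and Monitoring rules. It is an illustration under stated assumptions, not any driver's actual code: the function names are hypothetical, ObjectIds are modelled as strings, and counters are modelled as plain numbers.

```javascript
// Compare the topologyVersion the driver currently holds for a server with the
// one attached to an error. A negative result means "treat the error as newer".
function compareTopologyVersion(currentTv, errorTv) {
  if (currentTv == null || errorTv == null) {
    return -1; // either side unknown: assume the error is newer
  }
  if (currentTv.processId !== errorTv.processId) {
    return -1; // different process: assume the error is newer
  }
  return currentTv.counter - errorTv.counter;
}

// An error is stale (and ignored) only when the driver already holds a
// topologyVersion at least as new as the one in the error response.
function isStaleError(currentTv, errorResponse) {
  return compareTopologyVersion(currentTv, errorResponse.topologyVersion) >= 0;
}

// What the driver currently knows about the mongos (values are illustrative).
const knownMongosTv = { processId: "mongos-process", counter: 0 };

// Omitted topologyVersion: never stale, so the error is processed.
console.log(isStaleError(knownMongosTv, { ok: 0 })); // false

// Shard's topologyVersion forwarded by mongos: processId differs, so not stale.
console.log(isStaleError(knownMongosTv, {
  ok: 0,
  topologyVersion: { processId: "shard-process", counter: 8 },
})); // false

// Mongos's own unchanged topologyVersion: stale, so the error is ignored.
console.log(isStaleError(knownMongosTv, {
  ok: 0,
  topologyVersion: { processId: "mongos-process", counter: 0 },
})); // true
```

Under this model, both the omitted and the forwarded shard topologyVersion lead the driver to process the error, and only the mongos's own unchanged topologyVersion makes it ignore the error.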

Comment by Tess Avitabile (Inactive) [ 22/Sep/20 ]

Thank you!

Comment by Pavithra Vetriselvan [ 22/Sep/20 ]

The changes should be relatively straightforward, so I can do it this sprint!

Comment by Tess Avitabile (Inactive) [ 22/Sep/20 ]

Cool, sounds good! I think we should fix the issue this sprint if we can. Would it work for you to implement the fix this sprint? If it would be tough to fit in with your project plans this sprint, I can ask someone else.

Comment by Pavithra Vetriselvan [ 22/Sep/20 ]

Ah, yes that makes sense to me. Good call, Tess! In that case, I think we would have to check here and here to see if the response contains a topologyVersion. If so, we remove it. This logic should probably be implemented in a helper function, perhaps called removeTopologyVersionFromResponse.
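
The behavior proposed for removeTopologyVersionFromResponse can be modelled as below. This is only a sketch in JavaScript of the intended effect on a response document. The real change would live in the mongos server code, whose call sites and types are not shown in this ticket, so the signature here is an assumption.

```javascript
// Model of the proposed helper: return a copy of a shard's response without
// its topologyVersion field, leaving every other field untouched. Mongos does
// not add its own topologyVersion in its place.
function removeTopologyVersionFromResponse(response) {
  // Destructure the field away; `rest` holds all remaining top-level fields.
  const { topologyVersion, ...rest } = response;
  return rest;
}

// Illustrative shard error, using field values from the repro output.
const shardError = {
  topologyVersion: { processId: "5f501dcf33e67a0de9b4ab21", counter: 8 },
  ok: 0,
  errmsg: "Failing command due to 'failCommand' failpoint",
  code: 11602,
  codeName: "InterruptedDueToReplStateChange",
};

console.log(removeTopologyVersionFromResponse(shardError));
```

Returning a copy rather than mutating the input keeps the original shard response available for logging. Whether the real implementation should mutate in place is a design choice this sketch does not settle.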

Comment by Tess Avitabile (Inactive) [ 22/Sep/20 ]

Thanks, pavithra.vetriselvan and shane.harvey!

shane.harvey, when the driver receives a topologyVersion as part of a command error, does it store that topologyVersion and use it in its next isMaster command to the mongos? If so, the next isMaster command to the mongos will return immediately, since the processId will be wrong. I don't think that's terribly harmful, but it's unexpected behavior.

pavithra.vetriselvan, I think that in 4.4, mongos should never return its own topologyVersion as part of a command error. A node should only return its topologyVersion as part of a command error if that error is associated with a change in that node's topologyVersion. Otherwise, the driver will incorrectly ignore the command error. Post-4.4, the mongos should only return its own topologyVersion when in quiesce mode. So I think that the mongos should strip out the topologyVersion of the mongod from the response, but it should not append its own. Do you know the right place for us to strip out the topologyVersion of the mongod?

Comment by Pavithra Vetriselvan [ 21/Sep/20 ]

Got it, thanks for the explanation Shane!

Comment by Shane Harvey [ 21/Sep/20 ]

Thanks pavithra.vetriselvan. I believe that the only impact of returning the wrong topologyVersion is that clients which get multiple errors at the same time may mark the server unknown and clear the connection pool multiple times. In other words, it negates some of the benefit of DRIVERS-1187.

Generated at Thu Feb 08 05:23:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.