[SERVER-50549] Transform connection-related error codes in proxied commands Created: 26/Aug/20 Updated: 08/Jan/24 Resolved: 28/May/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Code |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.7, 5.0.0-rc1, 4.2.17, 5.1.0-rc0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Billy Donahue | Assignee: | Billy Donahue |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | apm-issue, post-rc0 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||||||
| Backport Requested: |
v5.0
|
||||||||||||||||||||||||||||||||||||
| Sprint: | Service arch 2020-09-07, Service arch 2020-10-05, Service arch 2020-11-02, Service arch 2020-11-16, Service arch 2020-12-28, Service Arch 2021-02-22, Service Arch 2021-03-08, Service Arch 2021-03-22, Service Arch 2021-04-05, Service Arch 2021-04-19, Service Arch 2021-05-17, Service Arch 2021-05-31 | ||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||||||||||
| Linked BF Score: | 166 | ||||||||||||||||||||||||||||||||||||
| Description |
|
Implementing a idea from the comments of SERVER-50459. Some mongos-proxied or intracluster operations will have their error codes returned verbatim to the originating client. In some cases the error codes are related to connection health, and will be inappropriate for forwarding. The example we're using is InterruptedAtShutdown, but there are other members of this class of errors. This is because the client's driver is required to break connection when these codes are received, but really the error code refers to a proxy-to-internal connection, not the originating-client-to-proxy connection. Obviously error codes that report on the status of a connection must be interpreted in the proper context. Client drivers currently assume that these errors refer to the client-proxy connection. We've decided to go with a simple improvement. Proxy connections will translate connection-related errors into a new different error which, as a new code, has no behavioral effect on drivers receiving it. It's still undecided whether there will be one new error code or many. |
| Comments |
| Comment by Githook User [ 09/Sep/21 ] | ||||||||
|
Author: {'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}Message: (cherry picked from commit 4a311668ac834b6e363a44c9f00cad9d4790288f) | ||||||||
| Comment by Githook User [ 11/Jun/21 ] | ||||||||
|
Author: {'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}Message: (cherry picked from commit 4a311668ac834b6e363a44c9f00cad9d4790288f) | ||||||||
| Comment by Githook User [ 08/Jun/21 ] | ||||||||
|
Author: {'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}Message: (cherry picked from commit 4a311668ac834b6e363a44c9f00cad9d4790288f) | ||||||||
| Comment by Githook User [ 28/May/21 ] | ||||||||
|
Author: {'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}Message: | ||||||||
| Comment by Githook User [ 11/May/21 ] | ||||||||
|
Author: {'name': 'Sviatlana Zuiko', 'email': 'sviatlana.zuiko@mongodb.com', 'username': 'szuiko'}Message: Revert " This reverts commit 1e10e32780ca47969342f6b52bfe09bc4134831e. | ||||||||
| Comment by Githook User [ 10/May/21 ] | ||||||||
|
Author: {'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}Message: | ||||||||
| Comment by Billy Donahue [ 05/May/21 ] | ||||||||
|
This is ready to go! | ||||||||
| Comment by Billy Donahue [ 07/Apr/21 ] | ||||||||
|
Just an update here as it's been a while and I'm switched back to it. The pseudocode from Divjot's most recent comment has been implemented. I'm testing this code, and it looks like it's working as it's intended to work. https://github.com/mongodb/mongo/compare/master...BillyDonahue:SERVER-50549 Several tests are red because they're not expecting mongos to rewrite anything. Hopefully these tests can be quickly adapted as part of the finishing touches phase after we check that this behavioral change solves the problem. Yes isabel.atkinson I believe the intent is to backport this change to 4.0. | ||||||||
| Comment by Isabel Atkinson [ 17/Feb/21 ] | ||||||||
|
When this fix is implemented, would it be possible to backport it to 4.0? Per discussion with other drivers team members, we have decided not to implement the spec changes detailed DRIVERS-1376 as this ticket should fix the underlying issues, so it would be beneficial to have this change available to users on older server versions who are experiencing issues with errors from proxies. | ||||||||
| Comment by Billy Donahue [ 16/Nov/20 ] | ||||||||
|
Divjot thanks for doing that testing. I could have saved you some time by describing in this ticket the exact algorithm we're implementing, as it was spoon fed to me. In the interest of expediency, I didn't front-load this project by reading the SDAM spec, so I'm at a disadvantage. I feel like it would be a good time to regroup and capture requirements for the next round. Let's see if we can work out an exact description of what we need. For reference, this is the first draft of the algorithm that you tested:
We revised this thanks to Shane and Pavi's excellent observations about Quiesce mode, such that the getKillAllOperations() changes to getKillAllOperations(sc) || quiesceModeActive(sc). Now we need to change a little further. "would imply that the server is shutting down" is not good enough. We need to also rewrite any retryable errors. I'm not sure how this is going to work. Obviously shutdown errors should be rewritten if we're not locally shutting down. That made sense. But other retriables would need a corresponding check to determine if they're locally applicable. I'd like to nail that down. | ||||||||
| Comment by Divjot Arora (Inactive) [ 16/Nov/20 ] | ||||||||
|
At the very least, I think this has to cover all state change errors (i.e. errors which force drivers to mark servers Unknown and/or clear connection pools). Otherwise, it won't address the 4.4 regression. We could consider covering all retryable errors if mongos performs its own retries, but I don't think that was the main motivation for requesting this change. | ||||||||
| Comment by Benjamin Caimano (Inactive) [ 16/Nov/20 ] | ||||||||
|
pavithra.vetriselvan, I'm hoping to see this cover all retriables honestly. Still, we have no scope for this work, so we're working on what is possible and what can reach consensus informally. Unfortunately, we can't rule out that mongod <-> mongod connections can forward errors because of some query magic. | ||||||||
| Comment by Pavithra Vetriselvan [ 16/Nov/20 ] | ||||||||
|
divjot.arora I agree that the current solution in CR does not completely solve the original problem filed in this ticket since we're not rewriting recovering error codes or not master error codes. My suggestion was just for shutdown errors. ben.caimano billy.donahue, is it possible that this patch evolves into rewriting all retryable errors codes in responses from mongos? Alternatively, if we don't extend this patch to cover more error codes, I assume that work will be covered in PM-2100. I wonder if it makes more sense to scope out the work formally in that project since there seems to be many moving parts. Interested to hear thoughts! Divjot, as for your last statement, I think this issue is unique to a mongos <--> mongod connection and don't think it's possible for mongod to return errors from other mongod's. | ||||||||
| Comment by Billy Donahue [ 13/Nov/20 ] | ||||||||
|
Offline discussion with Pavi, thanks. Answered some questions I had above. Quiesce mode is exclusively a precursor to shutdown. It should be treated the same as we'd treat shutdown, for the purposes of this change. Some clarification yielded a tweak to the existing CR. The quiescing server doesn't mind if the client starts to think of it as being in shutdown. An outgoing message indicating a shutdown should be left alone if we're in either quiesce mode OR shutdown mode. "Is there a simple predicate to detect whether we're in quiesce mode" ? Not so simple, because it's kept in a different coordinator in mongos than on mongod. However, a ServiceContext decoration could handle this in an abstract way. These coordinators can both populate the abstract decorator slot when they come online. It's some extra work, but it will be reusable effort. I think we'll need this if we move the message rewriting code into a smarter and more efficient place instead of the Message reparsing we're awkwardly doing in service_state_machine in my CR. (notes for myself from Pavi stored here for my own reference mongos quiesce mode is kept here: Same concept on mongod: | ||||||||
| Comment by Pavithra Vetriselvan [ 13/Nov/20 ] | ||||||||
|
Yes, quiesce mode will always precede a shutdown. Quiesce mode is maintained by the MongosTopologyCoordinator on monogos and the ReplicationCoordinatorImpl on mongod. Both are accessible via the ServiceContext, which is actually how they both enter quiesce mode here (mongos) and here (mongod). I agree with your predicate that "if we're shutting down OR in quiesce mode," don't rewrite anything on its way out. If we have access to the ServiceContext, we should be able to tell whether or not we're in quiesce mode. | ||||||||
| Comment by Pavithra Vetriselvan [ 13/Nov/20 ] | ||||||||
|
billy.donahue Ah, thank you for clarifying! Yes, I think it's a requirement that we should not be rewriting ShutdownInProgress error codes from a quiescing mongos. Otherwise the drivers wouldn't know to clear the connection pool for a mongos that's shutting down. shane.harvey, can you confirm that my assumption is correct? Another thing to note is that after mongos enters quiesce mode, it will increment its topologyVersion. We will attach the new topologyVersion to ShutdownErrors (both ShutdownInProgress and InterruptedAtShutdown) throughout the rest of the shutdown process. This ensures that if a driver has received a hello response with ok:0 and error code ShutdownInProgress, then it can ignore future ShutdownErrors that have the same topologyVersion. So I think the key here is that we should be distinguishing whether shutdown errors between locally-generated ShutdownErrors and proxied ones. Do y'all agree? | ||||||||
| Comment by Billy Donahue [ 13/Nov/20 ] | ||||||||
|
pavithra.vetriselvan, in my change, there is no distinction between locally-generated ShutdownError code and proxied ones. I believe you are saying a Quiesce-mode mongos will start locally emitting ShutdownInProgress, a ShutdownError code, even though the global ServiceContext is not in the KillAllOperations state. >> However, this error comes from the mongos itself and not a proxied connection, so it should not be overwritten. When you say "should not", I can't tell whether you're stating a requirement or a belief. My code as written doesn't care whether the message is from a proxy connection or not, so it sounds like it's not going to do what's needed. Would it be a problem if a quiescing mongos emitted HostUnreachable instead of InterruptedAtShutdown? Is Quiesce mode temporary or is it always meant to precede a shutdown? Is there a simple predicate to detect whether we're in quiesce mode, so we can maybe just have a rule that if we're shutting down OR in quiesce mode, we don't rewrite anything that's on its way out? It sounds like a mongos that's in quiesce mode doesn't mind the client thinking it's shutting down. | ||||||||
| Comment by Pavithra Vetriselvan [ 12/Nov/20 ] | ||||||||
|
shane.harvey, Quiesce Mode is enabled before we call setKillAllOperations, which kills remaining running operations during shutdown. Operations at this point in the shutdown process will fail with InterruptedAtShutdown errors. When we "enter" quiesce mode, we basically just sleep for the quiesceTime and don't continue with the shutdown process. In this patch, I think a mongos in quiesce mode would rewrite remote shard errors in the same way as a non-quiescing mongos since quiesce mode does not affect new or in-progress operations (on an existing connection). The major change in behavior is that a quiescing mongos will respond to any incoming hello requests with "ShutdownInProgress". However, this error comes from the mongos itself and not a proxied connection, so it should not be overwritten. I think it would be interesting and simple enough to write a jstest that confirmed that we don't overwrite "ShutdownInProgress" errors from a quiescing mongos but that we do overwrite other "not master" errors originating from the shard itself. | ||||||||
| Comment by Shane Harvey [ 12/Nov/20 ] | ||||||||
|
We should also ensure that a mongos in Quiesce mode (PM-1703) also rewrites remote shard errors in the same way as a non-Quiesce mode mongos. Looking at the CR the error rewriting behavior is gated on getKillAllOperations():
pavithra.vetriselvan does Quiesce mode change getKillAllOperations()? Do you think the change should test Quiesce mode explicitly? | ||||||||
| Comment by Billy Donahue [ 08/Nov/20 ] | ||||||||
|
Code Review https://mongodbcr.appspot.com/712610012/ | ||||||||
| Comment by Billy Donahue [ 01/Oct/20 ] | ||||||||
|
pavithra.vetriselvan Wow another complication! This is already a pretty complicated project and the text escaping is one of the strangest parts of it. If we have to add some kind of version negotiation it's going to be harder still. But maybe that's a separate project? Re: "Have we confirmed that this behavior exists in previous supported versions?" My understanding is that as a recent improvement (>=4.4), mongos partially retries commands 3x before giving up and returning a retryable error code to the client. So it retries a little, but not completely. I'm still verifying this. Retry codes still leak back to the client under those conditions. Changing the retry behavior of mongos has not been part of this proposal. It's only been about remapping the error codes and message strings that reach mongos clients in those cases in which a client would misconstrue them as referring to the state of the client <=> mongos connection. I am not sure what implications this would have for PM-1718, which I haven't been following closely. It may just be that when mongos is at the point of transforming an error to suppress its "Unknown"-marking side effects, it needs to escape occurrences of "not master" OR "not primary" in those error messages. | ||||||||
| Comment by Shane Harvey [ 02/Sep/20 ] | ||||||||
|
While we're at it, could we also have mongos transform state-change errors reported through the "writeErrors" and "writeConcernError" fields? For example, when a shard's primary is shutdown while handling a write operation, mongos can return these kind of responses:
Mongos could return the transformed error instead:
| ||||||||
| Comment by Billy Donahue [ 26/Aug/20 ] | ||||||||
|
divjot.arora Let me try to confirm my understanding of what you're asking for. The concern is that mongos might currently return these retriable codes directly without itself retrying them. But maybe under those current circumstances, it's kind of okay because the originating client will retry them. It's just sometimes inefficient because the client will unnecessarily reseat its connection to mongos before retrying. The concern is that if we then change the error code from something retryable to something new, then the originating client won't retry them. So retryable operations won't be retried. Is that the investigation requested? | ||||||||
| Comment by Billy Donahue [ 26/Aug/20 ] | ||||||||
|
Some pointers from ben.caimano (Slack conversation):
|