[SERVER-35132] Regression: all connections to {{mongos}} forced to reconnect during failover for clients with tight deadlines Created: 21/May/18 Updated: 29/Oct/23 Resolved: 24/Jan/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.2.20, 3.4.15, 3.6.4 |
| Fix Version/s: | 4.1.8 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Gregory Banks | Assignee: | Mathias Stearn |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: |
|
| Sprint: | Service Arch 2019-01-14, Service Arch 2019-01-28 |
| Participants: |
| Description |
|
For clients of mongos with tight deadlines, such as those that expect all queries to take less than 1s and that have maxTimeMS and socketTimeout set appropriately (1s and 2s respectively in our testing), a failover forces every connection from the client bound for the shard in transition to close and be reestablished. This can be problematic in environments with many connections (in addition to high throughput), because establishing connections can be expensive (e.g., a thread create/destroy per connection).

Note that setting socketTimeout to be greater than the failover period, so that existing connections persist rather than time out, is not a solution either: the application then either excessively queues operations on its side while waiting for existing connections to free up, or opens new connections in the interim to service those operations, again consuming excessive resources on the mongos side and inhibiting the timely feedback the application requires.

Prior to 3.2 this was not an issue: mongos would immediately pass "ReplicaSetMonitor no master found for set" errors back to the client, allowing it to decide how to handle retries while reusing existing connections. Since 3.2, however, the client connection to mongos hangs while mongos tries to find an acceptable replica set member, until either its configured timeout elapses (20s in versions >= 3.4, 11s in 3.2) or an acceptable member becomes available, with no way to control that timeout.
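As a rough illustration of the client configuration described above, here is a sketch using the mongocxx driver; the URI, namespace, and values are assumptions for the example, not taken from the reporter's application, and exact headers/option names may vary by driver version:

```cpp
#include <chrono>

#include <bsoncxx/builder/basic/document.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
#include <mongocxx/options/find.hpp>
#include <mongocxx/uri.hpp>

int main() {
    mongocxx::instance inst{};

    // socketTimeoutMS=2000 mirrors the 2s socket timeout used in the reporter's testing.
    mongocxx::client client{
        mongocxx::uri{"mongodb://mongos.example.net:27017/?socketTimeoutMS=2000"}};

    auto coll = client["test"]["coll"];

    // maxTimeMS=1000 mirrors the 1s per-operation deadline used in the reporter's testing.
    mongocxx::options::find opts;
    opts.max_time(std::chrono::milliseconds{1000});

    // During a failover the expectation is a prompt error, rather than mongos
    // holding the request for longer than the 2s socket timeout.
    auto cursor = coll.find(bsoncxx::builder::basic::make_document(), opts);
    for (auto&& doc : cursor) {
        (void)doc;  // consume results
    }
    return 0;
}
```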
In the 3.2 code we see that the default time to wait for an acceptable server is 11s, that the intention is to allow the client to influence this time (e.g., "This value is used if the operation doesn't have a user-specified max wait time."), presumably using $maxTimeMS, and that this has yet to be implemented (e.g., "TODO: Get remaining max time from 'txn'"). We also see this acknowledged in other parts of the 3.2 code (see src/mongo/s/query/async_results_merger.cpp):
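The referenced snippets are not reproduced in this export. As a minimal sketch of the behavior the TODO comment points at (the names below are hypothetical, not the actual mongos code), the host-selection wait would be clamped by the operation's remaining max time instead of always using the default:

```cpp
#include <algorithm>
#include <chrono>
#include <iostream>

using namespace std::chrono;

// Hypothetical default: the ticket reports an 11s wait for an acceptable
// server in 3.2 when no user-specified max wait time is available.
constexpr auto kDefaultFindHostMaxWaitTime = seconds(11);

// Sketch of "TODO: Get remaining max time from 'txn'": prefer the operation's
// remaining maxTimeMS when the client supplied one.
milliseconds findHostMaxWaitTime(milliseconds remainingMaxTime) {
    if (remainingMaxTime <= milliseconds::zero()) {
        return duration_cast<milliseconds>(kDefaultFindHostMaxWaitTime);
    }
    return std::min(remainingMaxTime,
                    duration_cast<milliseconds>(kDefaultFindHostMaxWaitTime));
}

int main() {
    // A client running with maxTimeMS=1000 would wait at most ~1s for a
    // primary instead of blocking for the full default.
    std::cout << findHostMaxWaitTime(milliseconds(1000)).count() << "ms\n";  // 1000ms
    std::cout << findHostMaxWaitTime(milliseconds(0)).count() << "ms\n";     // 11000ms
    return 0;
}
```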
This code gets executed via the following code path:
In versions >= 3.4, RemoteCommandTargeter::selectFindHostMaxWaitTime disappears, but the problem remains: the wait is instead hard-coded to 20s in various places (see src/mongo/s/query/async_results_merger.cpp):
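Again, the referenced 3.4 snippet is not reproduced here. A purely illustrative timing model (not MongoDB code) of why a fixed 20s wait forces reconnects for the client configuration described above:

```cpp
#include <chrono>
#include <iostream>

using namespace std::chrono;

int main() {
    // Values from the report: the client's socket timeout and the fixed
    // host-selection wait observed in >= 3.4.
    const auto clientSocketTimeout = seconds(2);
    const auto hostSelectionWait = seconds(20);

    if (hostSelectionWait > clientSocketTimeout) {
        // The client gives up on the socket long before mongos replies, so the
        // connection is closed and must be re-established after the failover.
        std::cout << "client socket times out after " << clientSocketTimeout.count()
                  << "s while mongos waits up to " << hostSelectionWait.count()
                  << "s for an acceptable member\n";
    }
    return 0;
}
```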
We see further evidence of the intention to fix this issue in src/mongo/client/remote_command_targeter_rs.cpp:
And in src/mongo/client/remote_command_targeter.h, as mentioned in the previous comment:
The code path is only slightly different in 3.4:
As noted above, this appears to be the same behavior in 3.6, although we have not yet tested with 3.6. Given the developer comments and the current undesirable behavior, we would like to see this issue addressed and/or understand what roadblocks are currently preventing implementation of a solution. |
| Comments |
| Comment by Githook User [ 24/Jan/19 ] |
|
Author: Benety Goh (benety) <benety@mongodb.com>
Message: |
| Comment by Mathias Stearn [ 24/Jan/19 ] |
|
The work done in |
| Comment by Githook User [ 23/Jan/19 ] |
|
Author: Mathias Stearn (RedBeard0531) <mathias@10gen.com>
Message: |
| Comment by Ramon Fernandez Marina [ 31/May/18 ] |
|
Thanks for your detailed report, gregbanks. This ticket has been sent to the Sharding Team for evaluation. Regards,