[SERVER-7975] Allow replica set secondary selection to be reset Created: 18/Dec/12  Updated: 06/Dec/22  Resolved: 06/Jun/19

Status: Closed
Project: Core Server
Component/s: Internal Client, Sharding
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: eHarmony Matching Assignee: Backlog - Service Architecture
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Service Arch

 Description   

According to the documentation, when a client connects through a mongos, the mongos considers all of the mongod instances with eligible ping times and picks one at random. Once the connection is made, it remains in place until the client closes it or it fails, at which point the client transparently renegotiates a connection to another mongod.
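
For illustration, the selection behavior described above can be modeled roughly as in the sketch below: members whose ping time falls within a latency window of the fastest member are eligible, and one is chosen at random. This is a model, not the mongos implementation; the class and method names are hypothetical, and the 15 ms window matches MongoDB's default localThresholdMS.

{code:java}
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

// Illustrative model only: filter members within a latency window of the
// fastest, then pick one at random. Once chosen, the connection sticks.
public class LatencyWindowSelector {
    record Member(String host, long pingMillis) {}

    private static final long LATENCY_WINDOW_MS = 15; // default localThresholdMS
    private final Random random = new Random();

    Member select(List<Member> members) {
        long fastest = members.stream()
                .mapToLong(Member::pingMillis)
                .min()
                .orElseThrow(() -> new IllegalArgumentException("no members"));
        // Everything within the window of the fastest member is "eligible".
        List<Member> eligible = members.stream()
                .filter(m -> m.pingMillis() - fastest <= LATENCY_WINDOW_MS)
                .collect(Collectors.toList());
        return eligible.get(random.nextInt(eligible.size()));
    }
}
{code}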

This behavior can lead to semi-pathological outcomes under the following circumstances:
1. Clients are long-running AND
2. Overall replica set load changes over time AND/OR
3. Mongod instances are restarted AND/OR
4. Ping times change over time

The above scenarios can lead to extremely unbalanced loads, with some replicas carrying many times as many connections as others and running at orders-of-magnitude higher CPU usage. This is because clients connected to the heavily loaded mongod instances never get an opportunity to notice that more lightly loaded instances are available. As a result, the overall performance of the replica set, and of the application, suffers.

Such outcomes could be significantly mitigated without a full "load-balancing" feature, simply by allowing clients to transparently renegotiate their mongod connections after a certain amount of time, just as they already do after a failure. The existing randomization should be sufficient to keep the load close to balanced; the lightly loaded mongod instances just need a periodic opportunity for clients to connect to them. (A client-side approximation is sketched below.)
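
Pending such a feature, an application-level workaround is to rebuild the client on a timer so that server selection runs again. The following is a minimal sketch against the modern Java driver's MongoClients/MongoClient API; the RefreshingClient wrapper, the refresh interval, and the handling around closing the old client are assumptions, not an existing driver feature.

{code:java}
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical wrapper: periodically replaces the underlying client so that
// server selection is re-run, approximating the renegotiation requested here.
public class RefreshingClient implements AutoCloseable {
    private final String connectionString;
    private final AtomicReference<MongoClient> current = new AtomicReference<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public RefreshingClient(String connectionString, long refreshMinutes) {
        this.connectionString = connectionString;
        current.set(MongoClients.create(connectionString));
        scheduler.scheduleAtFixedRate(this::refresh,
                refreshMinutes, refreshMinutes, TimeUnit.MINUTES);
    }

    private void refresh() {
        MongoClient fresh = MongoClients.create(connectionString);
        MongoClient old = current.getAndSet(fresh);
        // Real code would drain in-flight operations before closing.
        old.close();
    }

    public MongoClient get() {
        return current.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
        current.get().close();
    }
}
{code}

Callers would obtain the client via get() for each unit of work rather than caching it, so new operations land on the freshest client after each refresh.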



 Comments   
Comment by Mira Carey [ 06/Jun/19 ]

This pattern (sticking to a mongod once targeting is negotiated) hasn't been mongos behavior in quite a while. At least as far back as 3.6, almost all user operations have been routed through our newer connection pools via the task executor abstraction.

Comment by eHarmony Matching [ 15/Jul/13 ]

This issue is incorrectly filed under C++ Driver. It should be filed under Java Driver.

Comment by Stephen Lee [ 18/Dec/12 ]

I'm moving this to the Core Server project as a feature request.
