[SERVER-31010] Server Selection localThresholdMS enhancement Created: 08/Sep/17 Updated: 06/Dec/22 Resolved: 09/Nov/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Shakir Sadikali | Assignee: | Backlog - Service Architecture |
| Resolution: | Won't Do | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Assigned Teams: | Service Arch |
| Participants: | |||||
| Case: | (copied to CRM) | ||||
| Description |
|
From the documentation
If the prior average is denoted RTTt-1, then the new average (RTTt) is computed from a new RTT measurement (Xt) and a weighting factor (α) using the following formula: RTTt = α·Xt + (1-α)·RTTt-1
The weighting factor is set to 0.2, which was chosen to put about 85% of the weight of the average RTT on the 9 most recent observations. Weighting recent observations more means that the average responds quickly to sudden changes in latency.

Would it be possible to add a similar configuration option that tells the client to observe query/operation response times over a window of time or over the last n executions, and to specify a response-time threshold, so that the servers with the best response times are picked over slow-responding servers? |
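The quoted averaging rule can be sketched in a few lines of Python. This is only an illustration of the exponentially weighted moving average described above, not driver code: the function names and the sample values are made up for the example, and seeding the average with the first sample is an assumption.

```python
ALPHA = 0.2  # weighting factor quoted from the documentation


def update_rtt(prev_avg, sample, alpha=ALPHA):
    """Return the new average RTT; the first sample seeds the average (assumption)."""
    if prev_avg is None:
        return sample
    return alpha * sample + (1 - alpha) * prev_avg


def weight_on_recent(n, alpha=ALPHA):
    """Fraction of the average's weight carried by the n most recent samples."""
    return 1 - (1 - alpha) ** n


# Hypothetical heartbeat RTT samples in milliseconds: latency jumps from 10 to 60.
avg = None
for sample_ms in [10, 10, 10, 60, 60]:
    avg = update_rtt(avg, sample_ms)

recent_weight = weight_on_recent(9)
print(round(avg, 2), round(recent_weight, 3))
```

With α = 0.2, the 9 most recent samples carry 1 - 0.8^9 of the weight, which is where the "about 85%" figure comes from.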
| Comments |
| Comment by Lauren Lewis (Inactive) [ 09/Nov/21 ] |
|
We haven’t heard back from you for some time, so I’m going to close this ticket. If this is still an issue for you, please provide additional information and we will reopen the ticket. |
| Comment by Mira Carey [ 07/Jun/19 ] |
|
I wanted to provide an update on a feature that helps in situations like the one above, albeit through a very different kind of mechanism. Specifically, in pre-4.2 land, read preference targeting finds all eligible hosts, then selects one at random to execute a command against. After 4.2, read preference targeting will:
This should cause us to preferentially route requests to hosts with more ready connections (either because of a transient drop in performance/availability or because of heterogeneous capacity). This functionality is only in mongos for the moment (so you can't get it in a driver), and it isn't precisely what's described here, but it solves a similar set of use cases. It also has other benefits: it lets us respond much more quickly than a time-windowing system would, and it allows uneven distribution of work across heterogeneous servers (rather than being an all-or-nothing threshold). |
| Comment by Naga Mayakuntla [ 15/Sep/17 ] |
|
It would also help if an event could be generated to let the application know that one or more nodes were not considered because of high latency, so that we know which nodes had an issue and can log or alert on it. |
| Comment by Naga Mayakuntla [ 08/Sep/17 ] |
|
Example scenario: a 5-node cluster (2 in DC1, 2 in DC2, 1 in DC3) with the primary in DC1. If a client in DC2 sends a request, the 2 nodes in DC2 are eligible. If one of the nodes in DC2 has a storage/disk I/O latency problem, the heartbeat-based RTT calculation will not be aware of it (because isMaster does not interact with disk). Therefore, the client will continue to consider a node that is consistently returning queries past the desired SLA. It's likely that localThresholdMS was meant to only consider network latency, but if it could be amended (or a new parameter added) to consider end-to-end execution ... |
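A rough Python sketch of the scenario above, assuming the usual latency-window rule (a server stays eligible if its average latency is within localThresholdMS of the fastest candidate). The host names, the latency figures, and the idea of feeding operation times into the window are all hypothetical; no such driver option exists in this ticket.

```python
def within_latency_window(latencies_ms, local_threshold_ms=15):
    """Return hosts whose latency is within the threshold of the fastest host."""
    fastest = min(latencies_ms.values())
    return sorted(
        host
        for host, latency in latencies_ms.items()
        if latency <= fastest + local_threshold_ms
    )


# Hypothetical averages for the two DC2 nodes, as seen from a DC2 client.
# Heartbeat RTT only measures the network, so both nodes look healthy.
heartbeat_rtt_ms = {"dc2-a": 2.0, "dc2-b": 2.5}
# End-to-end operation times expose the disk problem on dc2-b.
operation_time_ms = {"dc2-a": 8.0, "dc2-b": 450.0}

by_heartbeat = within_latency_window(heartbeat_rtt_ms)
by_operation = within_latency_window(operation_time_ms)

# Hosts the operation-time window would skip; these are what an
# application-facing event (as requested in a later comment) might report.
skipped = sorted(set(by_heartbeat) - set(by_operation))
print(by_heartbeat, by_operation, skipped)
```

With heartbeat RTT both DC2 nodes stay in the window, while the operation-time figures would drop the node with the disk problem.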