Type: Task
Resolution: Unresolved
Priority: Major - P3
None
Component/s: SDAM
Needed
Summary
Avoid clearing the connection pool when the server connection rate limiter triggers.
Motivation
When a driver attempts a new connection to an overloaded server and the server rejects it because of the ingress connection rate limiter, the driver reacts by clearing the pool, closing the SDAM connection, and triggering an immediate SDAM check. This is bad for two reasons: 1) the immediate SDAM check needs to create a new connection, which will likely fail for the same reason; 2) the existing connections in the pool were healthy, so there was no reason to clear them. The client must then repopulate the connection pool, which puts even more connection creation pressure on the already overloaded node.
In practice this behavior acts as a sort of bad circuit breaker: it shuts off traffic to the overloaded server, the server recovers after some time, and then it hits the rate limit again, which shuts off traffic again. This cycle can lead to a metastable failure mode.
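The proposed change can be illustrated with a rough Python sketch. It is not taken from any driver: `Pool`, `Reaction`, `EstablishmentFailure`, and `handle_establishment_failure` are hypothetical names, and the ticket does not specify how a driver would recognize a rate-limiter rejection, so that classification is assumed to happen elsewhere. Skipping the immediate SDAM check in the rate-limited case is also an assumption drawn from the motivation above, not a stated requirement.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class EstablishmentFailure(Enum):
    """Hypothetical classification of a failed connection attempt."""
    RATE_LIMITED = auto()   # rejected by the server's ingress connection rate limiter
    NETWORK_ERROR = auto()  # any other failure during connection establishment


@dataclass
class Pool:
    """Toy stand-in for a driver connection pool (not a real driver type)."""
    connections: list = field(default_factory=list)
    generation: int = 0

    def clear(self) -> None:
        # Drop all pooled connections and bump the generation counter.
        self.connections.clear()
        self.generation += 1


@dataclass
class Reaction:
    cleared_pool: bool
    immediate_check: bool


def handle_establishment_failure(pool: Pool, failure: EstablishmentFailure) -> Reaction:
    """Decide how to react to a failed connection establishment."""
    if failure is EstablishmentFailure.RATE_LIMITED:
        # Assumed new behavior: established connections are still healthy, so
        # keep them and do not add load with an immediate SDAM check.
        return Reaction(cleared_pool=False, immediate_check=False)
    # Behavior described in this ticket for establishment failures today:
    # clear the pool and request an immediate SDAM check.
    pool.clear()
    return Reaction(cleared_pool=True, immediate_check=True)
```

With this shape, a rate-limited attempt leaves the pool and its generation untouched, while any other establishment failure still follows the existing clear-and-recheck path.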
Who is the affected end user?
Who are the stakeholders?
How does this affect the end user?
Are they blocked? Are they annoyed? Are they confused?
How likely is it that this problem or use case will occur?
Main path? Edge case?
If the problem does occur, what are the consequences and how severe are they?
Minor annoyance at a log message? Performance concern? Outage/unavailability? Failover can't complete?
Is this issue urgent?
Does this ticket have a required timeline? What is it?
Is this ticket required by a downstream team?
Needed by e.g. Atlas, Shell, Compass?
Is this ticket only for tests?
Does this ticket have any functional impact, or is it just test improvements?
Acceptance Criteria
What specific requirements must be met to consider the design phase complete?
is related to
- SERVER-108645 Back off between failed connection establishment attempts (Needs Scheduling)
- SERVER-108644 Do not drop established connections in the pool when new establishments are rate limited (Needs Scheduling)