- Type: Improvement
- Resolution: Fixed
- Priority: Unknown
- Affects Version/s: None
- Component/s: None
As part of the Tesla connection failure investigation, we've identified several failure modes in the Go driver's current connection pooling and timeout logic. They only occur under high load:
- An empty connection pool can't recover if all operation timeouts are too low.
- If operation timeouts are too low, a connection is almost guaranteed to be closed when an operation waits too long, either for a connection from the pool or for a new connection to be created.
- If server latency spikes and many connections are closed, the connection pool doesn't recover once the server does, because with operation timeouts that are too low the application still has trouble creating connections.
Configurations that seem to exacerbate the problem:
- TLS-encrypted connections
- OCSP verification without cert stapling
- Auth-handshake connections
Create an Evergreen task that can reproduce these scenarios by load testing against an Atlas deployment. Consider running the load test as part of the Atlas test suite.
- is depended on by GODRIVER-2038 Use "ConnectionTimeout" for creating all new connections and background connection creation (Closed)