Type: New Feature
Resolution: Fixed
Priority: Major - P3
Affects Version/s: None
Component/s: None
None
Go Drivers
Not Needed
Context
Currently, the Go Driver (like all drivers that implement CSOT) closes the in-use connection when an operation times out. Drivers do this because the connection cannot run another command until the previous one completes: the driver can send another command before receiving the previous reply, but the server processes commands on a single connection sequentially. Closing the connection also signals the server to stop working on the timed-out command, which prevents unnecessary work.
However, closing connections causes problems when many operations time out. When the driver closes a connection, it typically needs to create another one to replace it in order to sustain the same operation throughput. If many timeouts cause many connections to be closed, the driver also ends up creating many new connections, which adds load to both the driver and the server. When operations are already timing out, that extra load may cause even more timeouts, which cause even more load, creating a feedback loop of increasing load and timeouts.
To prevent that situation, we should not close the in-use connection when an operation times out. Instead, we should continue trying to read the command response after the operation times out (up to a maximum wait time). In the Go Driver, a simple way to implement that is to start a new goroutine that reads the command response and then checks the connection back into the pool. To limit the amount of time the connection is waiting for a response, we should also send maxTimeMS with all commands (except Find and Aggregate operations, see DRIVERS-2722).
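The background-read idea can be sketched roughly as follows. This is a minimal illustration, not the driver's actual code: the `pool` type, `roundTrip`, `drain`, `slowServer`, `demo`, and the fixed `replySize` are all hypothetical names invented here, and an in-memory `net.Pipe` stands in for a real server connection. On a read timeout the caller gets the error immediately, while a goroutine keeps reading the pending reply for up to `maxWait` and only then returns the connection to the pool (or closes it if the wait also expires).

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"os"
	"time"
)

// replySize is a hypothetical fixed reply length. A real wire protocol
// would read a length header first.
const replySize = 4

// pool is a hypothetical, minimal connection pool (not the driver's type).
type pool struct{ idle chan net.Conn }

// roundTrip sends req and waits up to timeout for the reply. On a timeout it
// returns the error right away and lets a goroutine finish the read.
func (p *pool) roundTrip(c net.Conn, req []byte, timeout, maxWait time.Duration) ([]byte, error) {
	if _, err := c.Write(req); err != nil {
		c.Close()
		return nil, err
	}
	buf := make([]byte, replySize)
	c.SetReadDeadline(time.Now().Add(timeout))
	n, err := io.ReadFull(c, buf)
	switch {
	case err == nil:
		c.SetReadDeadline(time.Time{}) // clear the deadline before reuse
		p.idle <- c
		return buf, nil
	case errors.Is(err, os.ErrDeadlineExceeded):
		// Keep the connection: read the unread part of the reply in the background.
		go p.drain(c, buf[n:], maxWait)
		return nil, err
	default:
		c.Close()
		return nil, err
	}
}

// drain reads the rest of the pending reply for at most maxWait. The
// connection is checked in on success and closed otherwise.
func (p *pool) drain(c net.Conn, rest []byte, maxWait time.Duration) {
	c.SetReadDeadline(time.Now().Add(maxWait))
	if _, err := io.ReadFull(c, rest); err != nil {
		c.Close()
		return
	}
	c.SetReadDeadline(time.Time{})
	p.idle <- c
}

// slowServer reads one 4-byte request and replies after delay.
func slowServer(c net.Conn, delay time.Duration) {
	req := make([]byte, 4)
	if _, err := io.ReadFull(c, req); err != nil {
		return
	}
	time.Sleep(delay)
	c.Write([]byte("pong"))
}

// demo runs one operation whose timeout is shorter than the server delay and
// reports whether it timed out and whether the connection was checked in again.
func demo() (timedOut, reused bool) {
	client, server := net.Pipe()
	defer server.Close()
	go slowServer(server, 50*time.Millisecond)

	p := &pool{idle: make(chan net.Conn, 1)}
	_, err := p.roundTrip(client, []byte("ping"), 10*time.Millisecond, time.Second)
	timedOut = errors.Is(err, os.ErrDeadlineExceeded)

	select {
	case c := <-p.idle:
		c.Close()
		reused = true
	case <-time.After(2 * time.Second):
	}
	return timedOut, reused
}

func main() {
	timedOut, reused := demo()
	fmt.Println("operation timed out:", timedOut)
	fmt.Println("connection returned to pool:", reused)
}
```

In this sketch `maxWait` plays the role of the maximum wait time mentioned above. Sending maxTimeMS with the command is what would make it likely that the server replies within that window.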
is related to
- GODRIVER-3151 Apply operation-level timeouts to maxTimeMS (Closed)
- DRIVERS-2884 CSOT avoid connection churn when operations timeout (In Progress)
- GODRIVER-3152 Set maxTimeMS to minimize connection churn (Closed)
related to
- GODRIVER-2944 Support CSOT spec timeoutMode for non-tailable cursors (Backlog)
- DRIVERS-2971 Read server responses after client-side timeouts (Needs Triage)
- GODRIVER-3181 Port "Read responses in the background after an operation timeout" to master (Closed)
- GODRIVER-3193 Don't use "background reads" when the CSOT doesn't send "maxTimeMS" (Closed)