- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Component/s: Retryability
This ticket is intended to track the work required to make all drivers resilient to connection handshake network errors with respect to retryable reads and writes.
Currently, the retryable reads and retryable writes specifications do not require operations to be retried in the following case:
- the driver successfully selects a server for the operation
- there is no idle connection already available in the pool
- the driver attempts to open a new connection to the server and complete the connection handshake
- the connection handshake fails with a network error
Instead, the specifications allow drivers to fail the operation without retrying, even though retrying would be safe in this case for both reads and writes.
One could read these specifications such that network errors during a connection handshake fall under the definition of a retryable error. However, the specifications do not state explicitly that this applies to the handshake, multiple drivers do not interpret them that way in practice, and no tests are defined to assert the behavior.
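The scenario above can be sketched as follows. This is a minimal illustration, not any driver's actual implementation: `HandshakeNetworkError`, `FakeServer`, and `run_with_handshake_retry` are hypothetical names, the pool is assumed to be empty on every checkout, and a single retry is assumed as the limit.

```python
from dataclasses import dataclass
from typing import Callable


class HandshakeNetworkError(Exception):
    """Hypothetical error: network failure while opening or handshaking a new connection."""


@dataclass
class FakeServer:
    """Hypothetical stand-in for a selected server; fails the handshake a set number of times."""
    name: str
    handshake_failures: int = 0
    attempts: int = 0

    def checkout_connection(self) -> str:
        # Assumption: the pool has no idle connection, so every checkout
        # opens a new connection and runs the handshake.
        self.attempts += 1
        if self.handshake_failures > 0:
            self.handshake_failures -= 1
            raise HandshakeNetworkError(f"handshake with {self.name} failed")
        return f"conn-to-{self.name}"


def run_with_handshake_retry(
    select_server: Callable[[], FakeServer],
    operation: Callable[[str], str],
    max_retries: int = 1,
) -> str:
    """Run `operation`, repeating server selection if the handshake fails with a network error."""
    retries = 0
    while True:
        server = select_server()
        try:
            conn = server.checkout_connection()
        except HandshakeNetworkError:
            # The command was never sent, so retrying is safe for reads and writes alike.
            if retries >= max_retries:
                raise
            retries += 1
            continue
        return operation(conn)
```

Because the failure happens before any command reaches the server, the sketch retries without distinguishing reads from writes. Errors raised by `operation` itself are deliberately left outside the retry loop here.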
Similarly, the server selection spec states:
After a server is selected, several error conditions could still occur that make the selected server unsuitable for sending the operation, such as:
- the server could have shutdown the socket (e.g. a primary stepping down),
- a connection pool could be empty, requiring new connections; those connections could fail to connect or could fail the server handshake
This specification does not require nor prohibit drivers from attempting automatic recovery for various cases where it might be considered reasonable to do so, such as:
- repeating server selection if, after selection, a socket is determined to be unsuitable before a message is sent on it
Note, however, that this is not a MUST in the specification, so in practice driver behavior differs.
- is related to
  - DRIVERS-1842 Drivers should retry authentication errors when connection handshake fails (Backlog)
  - DRIVERS-1390 Clarify that connection checkout for the initial attempt should not be retryable (Closed)
  - DRIVERS-2247 Add tests for non-retryable handshake errors (Backlog)
  - DRIVERS-2140 Clarify Auth Spec and Clean Up Error Section (Backlog)
- related to
  - DRIVERS-2032 Clarify server pinning behavior and pausable pool workflow (Backlog)
  - DRIVERS-2489 Improve test coverage for retryable handshake errors (Implementing)
- split to
  - GODRIVER-2191 Drivers should retry operations if connection handshake fails (Blocked)
  - CDRIVER-4192 Drivers should retry operations if connection handshake fails (Closed)
  - CSHARP-3919 Drivers should retry operations if connection handshake fails (Closed)
  - RUBY-2815 Drivers should retry operations if connection handshake fails (Closed)
  - RUST-1064 Drivers should retry operations if connection handshake fails (Closed)
  - JAVA-4354 Drivers should retry operations if connection handshake fails (Closed)
  - MOTOR-836 Drivers should retry operations if connection handshake fails (Closed)
  - PYTHON-2951 Drivers should retry operations if connection handshake fails (Closed)
  - CXX-2393 Drivers should retry operations if connection handshake fails (Closed)
  - NODE-3688 Drivers should retry operations if connection handshake fails (Closed)
  - PHPLIB-1042 Drivers should retry operations if connection handshake fails (Closed)