[DRIVERS-1390] Clarify that connection checkout for the initial attempt should not be retryable Created: 08/Sep/20  Updated: 25/Aug/23  Resolved: 09/Jun/21

Status: Closed
Project: Drivers
Component/s: Retryability
Fix Version/s: None

Type: Bug
Priority: Minor - P4
Reporter: Divjot Arora (Inactive)
Assignee: Unassigned
Resolution: Declined
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to DRIVERS-746 Drivers should retry operations if co... Implementing
Driver Changes: Not Needed

 Description   

The retryable writes pseudocode does not mention what should happen if there's an error getting a connection from the selected server's pool. The retryable reads pseudocode explicitly calls connection = server.getConnection at the very beginning and does not surround this call in a try/catch, so errors here will be propagated without retries. Also, the prose describing the pseudocode in both specs does not mention this case.

From talking to Jeremy, it seems like we want to propagate connection pool checkout errors to the user without any retries. This should be clarified in both the pseudocode and the prose of both specs.
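
To make the case concrete, here is a rough Python sketch of the behavior described above. It is not the spec pseudocode: FakeServer, PoolCheckoutError, RetryableError, and execute_once_retryable are all hypothetical names invented for illustration. The point is only that the initial checkout sits outside the try block, so a checkout failure reaches the caller without a retry, while a retryable error from the command itself is retried once.

```python
class PoolCheckoutError(Exception):
    """Hypothetical error: the pool could not hand out a connection."""


class RetryableError(Exception):
    """Hypothetical error: a failure the specs would classify as retryable."""


class FakeServer:
    """Hypothetical stand-in for a selected server and its connection pool."""

    def __init__(self, checkout_error=None, execute_error=None):
        self.checkout_error = checkout_error
        self.execute_error = execute_error
        self.checkouts = 0

    def get_connection(self):
        self.checkouts += 1
        if self.checkout_error is not None:
            raise self.checkout_error
        # For brevity the fake server doubles as its own connection.
        return self

    def execute(self, command):
        if self.execute_error is not None:
            raise self.execute_error
        return {"ok": 1, "command": command}


def execute_once_retryable(select_server, command):
    server = select_server()
    # Initial checkout is NOT inside the try block, so an error here
    # propagates to the caller without any retry.
    connection = server.get_connection()
    try:
        return connection.execute(command)
    except RetryableError:
        # One retry on a freshly selected server.
        server = select_server()
        connection = server.get_connection()
        return connection.execute(command)
```

With this shape, a server whose pool raises PoolCheckoutError on the first attempt fails the whole operation even if a healthy server is available for a retry.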

CC jmikola



 Comments   
Comment by Rachelle Palmer [ 09/Jun/21 ]

Based on requests from users, we WILL retry authentication/connection errors as per DRIVERS-746.

Comment by Divjot Arora (Inactive) [ 12/Oct/20 ]

shane.harvey I just filed this ticket to clarify because it seems like the specs are out of sync and the pseudocode can be improved. If we're changing this behavior in DRIVERS-746, I'm fine closing out this ticket in favor of that, but it's not clear to me whether that ticket is being prioritized based on the comments. Any ideas on what we want to do here?

Comment by Shane Harvey [ 12/Oct/20 ]

Wouldn't it be more robust to retry after errors on the connection checkout? Retrying is the behavior that is proposed in DRIVERS-746.

For example, imagine that an operation selects a server that is about to shut down. If we already have a connection to that node, we will get a retryable error when executing the operation. If we don't have a connection, the pool will automatically create one, which will also fail. It seems odd not to retry in the second case. Not retrying would make drivers less robust in the face of Atlas maintenance.
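
A rough Python sketch of the two cases above, assuming a single retry and treating a checkout failure the same as a retryable command error. This is only an illustration of the idea, not the DRIVERS-746 design; ShuttingDownServer, HealthyServer, PoolCheckoutError, RetryableError, and execute_with_checkout_retry are hypothetical names.

```python
class PoolCheckoutError(Exception):
    """Hypothetical error: the pool could not establish a new connection."""


class RetryableError(Exception):
    """Hypothetical error: a failure the specs would classify as retryable."""


class ShuttingDownServer:
    """Hypothetical server that is about to shut down."""

    def __init__(self, has_pooled_connection):
        self.has_pooled_connection = has_pooled_connection

    def get_connection(self):
        if not self.has_pooled_connection:
            # Case 2: no existing connection, and creating one fails.
            raise PoolCheckoutError("could not establish connection")
        return self

    def execute(self, command):
        # Case 1: an existing connection fails when the command runs.
        raise RetryableError("node is shutting down")


class HealthyServer:
    """Hypothetical server that accepts connections and commands."""

    def get_connection(self):
        return self

    def execute(self, command):
        return {"ok": 1}


def execute_with_checkout_retry(select_server, command, max_retries=1):
    attempt = 0
    while True:
        server = select_server()
        try:
            # Checkout is inside the try block, so both failure cases
            # take the same retry path.
            connection = server.get_connection()
            return connection.execute(command)
        except (PoolCheckoutError, RetryableError):
            if attempt >= max_retries:
                raise
            attempt += 1
```

In this sketch both cases end up on the second selected server, so the operation succeeds whether or not a connection to the shutting-down node already existed.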

Comment by Jeremy Mikola [ 09/Sep/20 ]

Note that neither https://github.com/mongodb/specifications/blob/master/source/retryable-writes/retryable-writes.rst#executing-retryable-write-commands nor https://github.com/mongodb/specifications/blob/master/source/retryable-reads/retryable-reads.rst#selecting-the-initial-server discuss the connection pool – so they should also be updated to be consistent with the pseudo-code in both specs.

Additionally, the retryable reads pseudo-code includes return executeCommand(server, command);, which may be a typo. Elsewhere in its pseudo-code, connection is passed instead of server.

Feel free to address all of these in a PR.
