[DRIVERS-2185] withTransaction should not retry after server selection timeout errors Created: 02/Feb/22  Updated: 31/Mar/22

Status: Backlog
Project: Drivers
Component/s: Transactions
Fix Version/s: None

Type: Spec Change
Priority: Unknown
Reporter: Shane Harvey
Assignee: Unassigned
Resolution: Unresolved
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to JAVA-4453 ClientSession.withTransaction doesn't... Closed
Driver Changes: Needed

 Description   

Summary

The withTransaction helper retries the callback even when the previous callback attempt failed with a server selection error. This happens because server selection errors raised inside the callback are labelled transient, under the following definition of a transient transaction error:

Any command error that includes the "TransientTransactionError" error label in the "errorLabels" field. Any network error or server selection error encountered running any command besides commitTransaction in a transaction. 

Motivation

During a prolonged outage, withTransaction may continue to retry the callback unsuccessfully for 120 seconds. This was encountered by a customer in JAVA-4453.
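The retry behavior described above can be illustrated with a minimal sketch. It is not taken from any driver: `with_transaction`, `ServerSelectionTimeoutError`, the `retry_server_selection` flag, and the injectable `clock` are hypothetical names used only for illustration, and the commit retry loop is omitted. The only details drawn from this ticket are the "TransientTransactionError" label and the 120 second limit.

```python
import time

TIMEOUT_SECONDS = 120  # retry window mentioned in this ticket


class ServerSelectionTimeoutError(Exception):
    """Hypothetical stand-in for a driver's server selection error."""

    def __init__(self, message="no suitable server found"):
        super().__init__(message)
        # Per the rule quoted above, a server selection error raised while
        # running a command in a transaction is labelled transient.
        self.error_labels = {"TransientTransactionError"}


def is_transient(exc):
    """Return True if the exception carries the transient label."""
    return "TransientTransactionError" in getattr(exc, "error_labels", ())


def with_transaction(callback, *, retry_server_selection=True,
                     clock=time.monotonic, timeout=TIMEOUT_SECONDS):
    """Simplified callback retry loop (assumption: commit handling omitted).

    With retry_server_selection=True the loop mirrors the behavior this
    ticket reports. With False it sketches the proposed change: a server
    selection error is surfaced immediately instead of being retried.
    """
    start = clock()
    while True:
        try:
            return callback()
        except Exception as exc:
            if (isinstance(exc, ServerSelectionTimeoutError)
                    and not retry_server_selection):
                raise  # proposed: do not retry server selection errors
            if is_transient(exc) and clock() - start < timeout:
                continue  # reported: retry until the window elapses
            raise
```

If each failed attempt blocks for a full server selection timeout (30 seconds is a common driver default, assumed here), the reported behavior runs the callback four times before giving up, while the proposed behavior fails after the first attempt.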

How does this affect the end user?

This behavior is confusing since we usually do not retry server selection errors. It also leads to long delays with little feedback to the user.

How likely is it that this problem or use case will occur?

It will occur if the cluster becomes unavailable during a withTransaction call.

Is this issue urgent?

Not urgent.

Is this ticket required by a downstream team?

No.

Is this ticket only for tests?

No.



 Comments   
Comment by Durran Jordan [ 03/Feb/22 ]

alexander.golin can we schedule this?
