The withTransaction helper retries the callback even when the previous callback attempt fails with a server selection error. The cause is that server selection errors raised in the callback are labelled with "TransientTransactionError", which withTransaction treats as retryable. The relevant definition:
Any command error that includes the "TransientTransactionError" error label in the "errorLabels" field. Any network error or server selection error encountered running any command besides commitTransaction in a transaction.
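A minimal sketch of the resulting retry behaviour, not any driver's actual implementation: it models only the callback-retry part of withTransaction and omits commit handling. ServerSelectionError, FakeClock, with_transaction, and simulate_outage are hypothetical names used for illustration, and the sketch assumes each failed server selection blocks for 30 seconds (the usual serverSelectionTimeoutMS default) before raising.

```python
TIMEOUT_SECONDS = 120.0                  # withTransaction retry limit
SERVER_SELECTION_TIMEOUT_SECONDS = 30.0  # assumed per-attempt selection timeout


class ServerSelectionError(Exception):
    """Hypothetical stand-in for a driver's server selection error."""

    def __init__(self, message, error_labels=()):
        super().__init__(message)
        self.error_labels = set(error_labels)

    def has_error_label(self, label):
        return label in self.error_labels


class FakeClock:
    """Simulated monotonic clock so the sketch runs instantly."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


def with_transaction(callback, clock):
    """Simplified loop: retry the callback on transient errors until the limit."""
    start = clock.now
    while True:
        try:
            return callback()
        except ServerSelectionError as exc:
            transient = exc.has_error_label("TransientTransactionError")
            within_limit = clock.now - start < TIMEOUT_SECONDS
            if transient and within_limit:
                continue  # retry the whole callback
            raise  # limit reached or error not transient: surface to the caller


def simulate_outage():
    """Run the loop against a cluster that never becomes available."""
    clock = FakeClock()
    attempts = 0

    def callback():
        nonlocal attempts
        attempts += 1
        # Each attempt waits out the full selection timeout, then fails
        # with an error that carries the transient label.
        clock.advance(SERVER_SELECTION_TIMEOUT_SECONDS)
        raise ServerSelectionError(
            "no servers available", ["TransientTransactionError"]
        )

    try:
        with_transaction(callback, clock)
    except ServerSelectionError:
        pass
    return attempts, clock.now


if __name__ == "__main__":
    print(simulate_outage())  # (4, 120.0)
```

Under these assumptions the callback runs four times and the caller sees the error only after 120 simulated seconds, rather than after the first 30-second server selection failure.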
During a prolonged outage, withTransaction may continue to retry the callback unsuccessfully for 120 seconds. This was encountered by a customer in
How does this affect the end user?
This behavior is confusing because we usually do not retry server selection errors. It also leads to long delays, up to the 120-second withTransaction limit, with little feedback to the user.
How likely is it that this problem or use case will occur?
It will occur if the cluster becomes unavailable during a withTransaction call.
Is this issue urgent?
Is this ticket required by a downstream team?
Is this ticket only for tests?