|
The driver timeout is infinite by default, which makes sense because the driver retries only "retryable" errors: those where we judge that a retry is likely to succeed within a few attempts. I believe (without evidence) that a transaction that runs longer than transactionLifetimeLimitSeconds on its first try will probably do so on every try, so that error shouldn't be labeled retryable.
Other things we label retryable have temporary causes: failover, write conflict, ... The cause of a too-long transaction is often permanent: the client is trying to do too much work in one transaction.
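To make the distinction concrete, here is a minimal sketch of classifying errors by cause. The cause names, `OperationError`, and `is_retryable` are hypothetical and for illustration only; they are not real driver or server identifiers.

```python
# Hypothetical cause names, for illustration only (not real driver/server labels).
TRANSIENT_CAUSES = {"failover", "write_conflict"}
PERMANENT_CAUSES = {"transaction_too_long"}


class OperationError(Exception):
    """Hypothetical error type carrying the cause of a failed operation."""

    def __init__(self, cause: str):
        super().__init__(cause)
        self.cause = cause


def is_retryable(err: OperationError) -> bool:
    # Retry only when the cause is expected to clear up on its own.
    # A transaction that exceeded its lifetime limit is assumed to do so
    # again, so it is deliberately left out of the transient set.
    return err.cause in TRANSIENT_CAUSES
```

Under this assumption, `is_retryable(OperationError("failover"))` is true, while the too-long transaction is surfaced to the caller on its first failure.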
|
|
There are two timeouts at play here: the driver timeout and the server timeout. The driver timeout is set by an application developer saying "I want this request to keep trying for this long". The server timeout is set by a database operator saying "we need to kill transactions that take longer than X so that resources are shared fairly and used appropriately". From my perspective, we should honor the driver's timeout here and retry on server timeout errors, in case the timeout doesn't happen again. I think there are many cases, especially around lock contention, where a retry would succeed even though the original operation timed out. That said, this is probably a stronger argument for getting rid of the server-side timeout, and retrying work the server just threw away of its own accord does feel wasteful.
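As a rough sketch of what I mean, the loop below keeps retrying a server-side timeout until the driver's own deadline runs out. `ServerTimeout`, `run_with_deadline`, and its parameters are hypothetical names for illustration; this is not how any existing driver is implemented.

```python
import time


class ServerTimeout(Exception):
    """Hypothetical stand-in for the server killing a too-long transaction."""


def run_with_deadline(operation, driver_timeout_s, backoff_s=0.0, clock=time.monotonic):
    """Retry `operation` on ServerTimeout until the driver deadline passes.

    Returns (result, attempts). Re-raises the last ServerTimeout once
    another attempt could not start before the deadline.
    """
    deadline = clock() + driver_timeout_s
    attempts = 0
    while True:
        attempts += 1
        try:
            return operation(), attempts
        except ServerTimeout:
            # The application's budget is spent: give up and surface the error.
            if clock() + backoff_s >= deadline:
                raise
            time.sleep(backoff_s)
```

The cost mentioned above is visible here: every extra pass through the loop repeats work the server has already discarded.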
|