This ticket is a placeholder for deciding, among ourselves and with TSE, which supportability and diagnosability enhancements we need to include for retryable writes.
Things that come to mind include:
- FTDC metrics for number of active sessions
- Metrics for the CPU (and if possible I/O) overhead of maintaining the retryability metadata
- Metrics for the number of writes that have been retried
- If we end up with a cache, metrics for its hit rate
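
To make the discussion concrete, here is a rough sketch of how the counter-style metrics above might be grouped and reported. Everything in it is hypothetical: the class name, the field names, and the keys in the snapshot are placeholders for illustration, not existing server or FTDC identifiers. The CPU and I/O overhead metric is left out because it would need real instrumentation rather than a simple counter.

```python
from dataclasses import dataclass


@dataclass
class RetryableWriteMetrics:
    """Hypothetical counters for retryable writes (illustrative names only)."""

    active_sessions: int = 0   # sessions currently tracked
    retried_writes: int = 0    # writes recognized as retries
    cache_hits: int = 0        # only meaningful if a cache is introduced
    cache_misses: int = 0

    def record_cache_lookup(self, hit: bool) -> None:
        # Count one lookup against the (hypothetical) retryability cache.
        if hit:
            self.cache_hits += 1
        else:
            self.cache_misses += 1

    def cache_hit_rate(self) -> float:
        # Fraction of lookups served from the cache; 0.0 before any lookup.
        lookups = self.cache_hits + self.cache_misses
        return self.cache_hits / lookups if lookups else 0.0

    def snapshot(self) -> dict:
        # Point-in-time view, the kind of document a periodic collector could sample.
        return {
            "activeSessions": self.active_sessions,
            "retriedWrites": self.retried_writes,
            "cacheHitRate": self.cache_hit_rate(),
        }
```

A periodic collector could call `snapshot()` on each sampling interval. Whether the hit rate should be reported as a ratio or as raw hit and miss counters is one of the things to settle with TSE.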