- Type: Spec Change
- Resolution: Duplicate
- Priority: Unknown
- None
- Component/s: Retryability
- None
- Needed
Introduce an adaptive retry policy, based on token buckets, to limit load amplification during peak overload.
Clients will support an adaptive retry policy that aims to overcome metastable failures during overload. Each time a request succeeds (ok:1, or a “successful” error such as DuplicateKeyError), the client deposits a fractional token into a bucket. Each time a request fails (ok:0 with a SystemOverloaded error), the client retries as normal, with exponential backoff and jitter, as long as whole tokens are available in the bucket. This gives the client a limited memory of the upstream service's operating conditions: if tokens are available for retry, the service has been healthy recently.
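The policy described above could be sketched roughly as follows. This is an illustration only, not the specified driver behavior: `TokenBucket`, `SystemOverloadedError`, `call_with_adaptive_retry`, and every numeric default (capacity, deposit size, retry cost, delays, attempt limit) are hypothetical names and values chosen for the example. The sketch also assumes that the bucket starts full and that each retry withdraws one whole token, neither of which is stated in this ticket.

```python
import random
import time


class SystemOverloadedError(Exception):
    """Hypothetical stand-in for an ok:0 reply with a SystemOverloaded error."""


class TokenBucket:
    """Tracks recent upstream health as a bounded balance of retry tokens."""

    def __init__(self, capacity=10.0, deposit=0.25, retry_cost=1.0):
        self.capacity = capacity
        self.deposit_amount = deposit
        self.retry_cost = retry_cost
        self.tokens = capacity  # assumption: the bucket starts full

    def deposit(self):
        # A successful request adds a fractional token, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + self.deposit_amount)

    def try_withdraw(self):
        # A retry is allowed only while a whole token is available.
        if self.tokens >= self.retry_cost:
            self.tokens -= self.retry_cost
            return True
        return False


def call_with_adaptive_retry(operation, bucket, max_attempts=3,
                             base_delay=0.05, max_delay=1.0, sleep=time.sleep):
    """Run operation(), retrying overload errors while the bucket has tokens."""
    failures = 0
    while True:
        try:
            result = operation()
        except SystemOverloadedError:
            failures += 1
            # Give up when attempts are exhausted or the bucket is empty.
            if failures >= max_attempts or not bucket.try_withdraw():
                raise
            # Exponential backoff with full jitter before the next attempt.
            cap = min(max_delay, base_delay * 2 ** (failures - 1))
            sleep(random.uniform(0, cap))
            continue
        # Success: record that the upstream service looks healthy.
        # A fuller version would also deposit on "successful" errors
        # such as DuplicateKeyError.
        bucket.deposit()
        return result
```

With a shared bucket per client, a sustained overload drains the tokens after a few retries, and later failures are surfaced immediately instead of being retried, which is what bounds the load amplification.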
- duplicates
  - DRIVERS-3239 Exponential backoff and jitter in retry loops (Ready for Work)
- is related to
  - PYTHON-5506 Prototype adaptive token bucket retry policy (Closed)
- related to
  - SERVER-108583 Add diagnostic metadata to identify retried commands (Backlog)
  - DRIVERS-3246 Make system overload errors easier to diagnose (Backlog)
  - DRIVERS-3241 Add diagnostic metadata to retried commands (Blocked)