Adaptive token bucket retry policy


    • Type: Spec Change
    • Resolution: Unresolved
    • Priority: Unknown
    • Component/s: Retryability
    • Needed

      Summary of necessary driver changes

      •  

      Commits for syncing spec/prose tests
      (and/or refer to an existing language POC if needed)

      •  

      Context for other referenced/linked tickets

      •  

      Introduce an adaptive retry policy (based on token buckets) to limit load amplification during peak overload.

      Clients will support an adaptive retry policy that aims to overcome metastable failures during overload. Each time a client makes a successful request (ok:1, or an error response that still counts as a success, such as DuplicateKeyError), it deposits a fractional “token” into a bucket. Each time a request fails (ok:0 with a SystemOverloaded error), the client retries as normal (with exponential backoff and jitter) as long as whole tokens are available in the bucket. This approach gives the client a limited memory of the operational conditions of the upstream service: if tokens are available for retry, then the service has been healthy recently.
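
The policy described above could be sketched roughly as follows. This is an illustration only, not the spec's design: the class and function names, the bucket capacity, the deposit size, the starting fill level, the backoff constants, and the rule that each retry consumes one whole token are all assumptions that the ticket does not state.

```python
import random
import time
from dataclasses import dataclass


class SystemOverloaded(Exception):
    """Stand-in for an ok:0 reply carrying a SystemOverloaded error."""


@dataclass
class RetryTokenBucket:
    """Hypothetical retry budget; every numeric default is an assumption."""
    capacity: float = 10.0    # assumed upper bound on stored tokens
    deposit: float = 0.25     # assumed fractional token earned per success
    retry_cost: float = 1.0   # assumed: each retry spends one whole token
    tokens: float = 10.0      # assumed: the bucket starts full

    def record_success(self) -> None:
        # A successful request deposits a fractional token, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + self.deposit)

    def try_acquire_retry(self) -> bool:
        # A retry is allowed only while a whole token is available.
        if self.tokens >= self.retry_cost:
            self.tokens -= self.retry_cost
            return True
        return False


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 10.0) -> float:
    """Exponential backoff with full jitter (constants are assumptions)."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def run_with_retries(operation, bucket, max_attempts=5, sleep=time.sleep):
    """Call operation(), retrying overload errors while the bucket allows it."""
    for attempt in range(max_attempts):
        try:
            result = operation()
        except SystemOverloaded:
            out_of_attempts = attempt == max_attempts - 1
            # Give up when attempts are exhausted or no whole token is left.
            if out_of_attempts or not bucket.try_acquire_retry():
                raise
            sleep(backoff_delay(attempt))
        else:
            bucket.record_success()
            return result
```

Under these assumptions, a sustained run of overload errors drains the bucket and later failures surface immediately instead of being retried, while a healthy period slowly refills it. Passing `sleep` as a parameter is only a convenience so the loop can be exercised without real delays.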

              Assignee:
              Unassigned
              Reporter:
              Shane Harvey
              Jib Adegunloye
              Votes:
              0
              Watchers:
              1

                Created:
                Updated: