Exponential backoff and jitter in retry loops


      Key            Status/Resolution   FixVersion
      CDRIVER-6092   In Code Review
      CXX-3342       Backlog
      CSHARP-5723    In Code Review
      GODRIVER-3658  Backlog
      JAVA-5956      Backlog
      NODE-7142      In Progress
      PYTHON-5528    Done
      PHPLIB-1719    Blocked
      RUBY-3706      Ready for Work
      RUST-2273      In Progress

      As part of DRIVERS-3160 (Client Backpressure), we plan to make retry loops use exponential backoff with jitter to reduce load on the server and improve goodput. Retryable reads and writes retry only once by default but can retry multiple times when CSOT is enabled. The convenient transaction API (DRIVERS-1934) will also retry multiple times. These retry loops should share a common backoff and jitter policy.
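
      A minimal sketch of what a shared backoff and jitter policy could look like. It assumes the "full jitter" variant (a uniform random delay below an exponentially growing ceiling); the ticket does not say which variant is intended. The names backoff_delay, retry_with_backoff and OverloadedError, and the base and cap values, are hypothetical and not taken from any driver.

```python
import random
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for a retryable server-overload error."""


def backoff_delay(attempt, base=0.1, cap=10.0, rng=random.random):
    """Return a delay in seconds for a zero-based retry attempt.

    base and cap are illustrative values, not values from the ticket.
    """
    # Exponential ceiling: base * 2^attempt, clamped to cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly in [0, ceiling) so clients spread out.
    return rng() * ceiling


def retry_with_backoff(operation, max_attempts=5, sleep=time.sleep, **backoff_kwargs):
    """Call operation(), retrying on OverloadedError with backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except OverloadedError:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt, **backoff_kwargs))
```

      Because every retry loop would call the same backoff_delay helper, retryable reads and writes and the transaction helper would back off identically. The sleep and rng parameters are injectable only so the policy can be tested without real waiting.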

      We also add an adaptive token bucket to limit load amplification during peak overload. Each time a client makes a successful request (ok:1, or an error such as DuplicateKeyError that still shows the server processed the request), it deposits a fractional “token” into a bucket. Each time a request fails (ok:0 with a SystemOverloaded error), the client retries as normal (with exponential backoff and jitter) as long as whole tokens are available in the bucket. This gives the client a limited memory of the upstream service's recent operating conditions: if tokens are available for retry, the service has been healthy recently.
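
      The token bucket described above can be sketched as follows. Several details are assumptions rather than statements from the ticket: that each retry spends one whole token, that the bucket starts full, and the capacity and deposit values. The class name RetryTokenBucket is hypothetical, and the sketch is not thread-safe.

```python
class RetryTokenBucket:
    """Illustrative retry budget; all default values are assumptions."""

    def __init__(self, capacity=100.0, deposit=0.1, retry_cost=1.0):
        self.capacity = capacity      # maximum tokens the bucket can hold
        self.deposit = deposit        # fractional token added per success
        self.retry_cost = retry_cost  # whole token spent per retry (assumed)
        self.tokens = capacity        # assumption: the bucket starts full

    def record_success(self):
        """Deposit a fractional token after a successful request."""
        self.tokens = min(self.capacity, self.tokens + self.deposit)

    def try_acquire_retry(self):
        """Return True and spend a token if a retry is currently allowed."""
        if self.tokens >= self.retry_cost:
            self.tokens -= self.retry_cost
            return True
        # No whole token left: skip the retry to avoid amplifying load.
        return False
```

      In a retry loop, the client would call try_acquire_retry() after a SystemOverloaded failure and only then back off and retry; record_success() would run after each successful response. During sustained overload the bucket drains and retries stop until successes refill it.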

            Assignee: Noah Stapp
            Reporter: Shane Harvey
            Jib Adegunloye
            Votes: 0
            Watchers: 3
