Ensure server selection does not bias towards overloaded server


      Key Status/Resolution FixVersion
      CDRIVER-6126 Blocked
      CXX-3373 Blocked
      CSHARP-5759 Blocked
      GODRIVER-3676 Blocked
      JAVA-5984 Blocked
      NODE-7234 Blocked
      PYTHON-5618 Blocked
      PHPLIB-1732 Blocked
      RUBY-3716 Blocked
      RUST-2290 Blocked

      Summary

      Ensure server selection does not bias towards an overloaded server.

      Motivation

      The server ingress rate limiter plans to reject excess operations quickly when the server is overloaded. In certain cases this will be problematic for our "power of two random choices" server selection algorithm based on operationCount (implemented in SPEC-1555), because that algorithm relies on the assumption that request latency goes up during overload. With fast rejection, request latency on the overloaded server goes down instead, which can lead to a lower operationCount on that server. The end result is increased error rates, as new requests are biased towards the already overloaded server.
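
To make the bias concrete, here is a minimal Python sketch of operationCount-based "power of two random choices" selection. It is an illustration only, not any driver's implementation: `Server`, `select_server`, and `share_sent_to` are hypothetical names, and the operation counts are a fixed snapshot chosen by assumption (healthy servers hold 20 in-flight operations, while the overloaded server rejects so quickly that its count stays at 0).

```python
import random
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical stand-in for a driver's view of one mongos."""
    name: str
    operation_count: int = 0  # in-flight operations tracked by the driver


def select_server(servers, rng=random):
    """Sample two distinct candidates and keep the one with fewer in-flight operations."""
    if len(servers) == 1:
        return servers[0]
    a, b = rng.sample(servers, 2)
    return a if a.operation_count <= b.operation_count else b


def share_sent_to(target, servers, attempts=10_000, seed=0):
    """Fraction of selections that land on `target` for a fixed snapshot of counts."""
    rng = random.Random(seed)
    hits = sum(select_server(servers, rng) is target for _ in range(attempts))
    return hits / attempts


if __name__ == "__main__":
    # Assumed snapshot: fast rejections keep the overloaded server's count low.
    healthy_a = Server("mongos-a", operation_count=20)
    healthy_b = Server("mongos-b", operation_count=20)
    overloaded = Server("mongos-c", operation_count=0)
    servers = [healthy_a, healthy_b, overloaded]
    print(f"share sent to overloaded server: {share_sent_to(overloaded, servers):.2f}")
```

Under these assumed counts, the overloaded server appears in the sampled pair with probability 2/3 and wins every comparison it takes part in, so it receives about two thirds of new operations instead of the one third an unbiased choice would give it.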

      Who is the affected end user?

      Customers.

      How does this affect the end user?

      Potential for higher error rates during overload.

      How likely is it that this problem or use case will occur?

      Likely during partial overload, e.g. when only 1 out of 3 mongoses is overloaded.

      If the problem does occur, what are the consequences and how severe are they?

      Higher error rates during overload. Longer time to recovery.

      Is this issue urgent?

      Does this ticket have a required timeline? What is it?

      Is this ticket required by a downstream team?

      Needed by e.g. Atlas, Shell, Compass?

      Is this ticket only for tests?

      No.

      Acceptance Criteria

      What specific requirements must be met to consider the design phase complete?

            Assignee:
            Unassigned
            Reporter:
            Shane Harvey
            Jib Adegunloye