Prevent egress connection churn caused by maxTimeMS expiration and cancellation

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Networking & Observability
    • 3
    • TBD

      mongos uses the maxTimeMS provided by a request to set a deadline for processing that request, including any egress networking done on its behalf. If the deadline is hit while mongos is performing networking, the egress work is canceled immediately, which often means the connection involved has to be closed. This cancellation is implemented in two different ways:

      During periods of high load, when timeouts may be more common, this connection churn can induce a feedback loop: more timeouts lead to more churn and more work for the reactor, and ultimately to unavailability. We should update mongos' egress networking to enforce deadlines and timeouts without sacrificing egress connections. This would reduce the risk of unavailability and make the connection pool a more reliable way to constrain concurrency on mongos.
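To make the difference between the two behaviors concrete, here is a minimal toy model, not mongos code. It compares a pool that closes a connection whenever a request's deadline expires with one that fails the request but keeps the connection. All names (`Pool`, `Connection`, `simulate`, `close_on_timeout`) are hypothetical, requests run one at a time, and the late reply is assumed to be drained instantly, which a real implementation could not assume.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    conn_id: int
    # Request ids whose replies the remote side still owes us.
    pending_replies: set = field(default_factory=set)


@dataclass
class Pool:
    # Hypothetical policy switch; not a real mongos option.
    close_on_timeout: bool
    idle: list = field(default_factory=list)
    created: int = 0  # connections established over the pool's lifetime
    closed: int = 0   # connections torn down because a deadline expired

    def acquire(self) -> Connection:
        # Reuse an idle connection if one exists, otherwise open a new one.
        if self.idle:
            return self.idle.pop()
        self.created += 1
        return Connection(conn_id=self.created)

    def run_request(self, request_id: int, latency_ms: int, max_time_ms: int) -> str:
        conn = self.acquire()
        if latency_ms <= max_time_ms:
            self.idle.append(conn)
            return "ok"
        # The deadline expired while the request was in flight.
        if self.close_on_timeout:
            # Policy A: give up the connection along with the request.
            self.closed += 1
        else:
            # Policy B: fail the request but keep the connection. The late
            # reply is discarded before reuse (instantly, in this toy model).
            conn.pending_replies.add(request_id)
            conn.pending_replies.clear()
            self.idle.append(conn)
        return "timeout"


def simulate(close_on_timeout: bool, latencies_ms: list, max_time_ms: int):
    """Run the requests sequentially and return (per-request results, pool)."""
    pool = Pool(close_on_timeout=close_on_timeout)
    results = [
        pool.run_request(i, latency, max_time_ms)
        for i, latency in enumerate(latencies_ms)
    ]
    return results, pool


if __name__ == "__main__":
    latencies = [5, 50, 5, 50, 50, 5]
    for policy in (True, False):
        results, pool = simulate(policy, latencies, max_time_ms=10)
        print(policy, results, "created:", pool.created, "closed:", pool.closed)
```

In both runs the caller sees the same sequence of successes and timeouts. Only the number of connections established differs, and that is the churn this ticket aims to remove.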

            Assignee:
            Unassigned
            Reporter:
            Patrick Freed