  1. Core Server
  2. SERVER-61692

Reproduce connection storm behavior and try mitigating it with a load balancer

    • Type: Task
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Service Arch 2022-2-07, Service Arch 2022-2-21, Service Arch 2022-03-07, Service Arch 2022-03-21
    • 10

      We should write an ad hoc test that reproduces connection storms, and then try running the same tests with an L4 load balancer in between clients and mongos to see whether using a load balancer prevents a connection storm, as we intend it to. As part of this we should:

      • Define exactly what workload we expect to reproduce a connection storm, along with the expected symptoms of a connection storm. This should possibly draw on previous HELP tickets and/or customer issues, and may involve talking to TSEs. My best current understanding is that load balancers are intended to help with a scenario where the number of app servers rapidly scales up, with the minimum connection pool size set to some non-zero value. One motivating example might be the one detailed in this blog post.
      • Try to reproduce that scenario in the easiest way possible. At this point we do not care about getting the reproducer into our continuous integration suites; that will be done as follow-on work in SERVER-61693.
      • Once we're able to reliably reproduce the connection storm behavior, run the exact same workload but with an L4 load balancer deployed (probably something like Elastic Load Balancer on AWS) and see if the issue goes away. If it does not, document the behavior.
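
      As a rough way to size the scale-up scenario above, here is a minimal Python sketch that builds the connection strings for the two runs and estimates how many connections are opened up front. It rests on assumptions not stated in this ticket: that each client keeps one pool per mongos when connecting directly and a single pool when going through the load balancer, and that each pool is filled to `minPoolSize` eagerly. Monitoring connections are ignored. The helper names and host names are hypothetical; `minPoolSize` and `loadBalanced` are standard connection string options.

      ```python
      from urllib.parse import urlencode


      def build_uri(hosts, min_pool_size, load_balanced=False):
          """Build a mongodb:// URI; `hosts` is a list of "host:port" strings."""
          options = {"minPoolSize": min_pool_size}
          if load_balanced:
              # Load-balanced mode takes exactly one host: the load balancer endpoint.
              if len(hosts) != 1:
                  raise ValueError("load-balanced mode takes a single host")
              options["loadBalanced"] = "true"
          return "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)


      def eager_connections(app_servers, min_pool_size, pools_per_client):
          """Estimate connections opened when `app_servers` clients start at once.

          Assumes every pool is filled to `min_pool_size` on startup.
          """
          return app_servers * min_pool_size * pools_per_client


      # Hypothetical deployment: 3 mongos routers, 500 app servers scaling up together.
      mongos_hosts = ["mongos1.example:27017", "mongos2.example:27017", "mongos3.example:27017"]
      direct_uri = build_uri(mongos_hosts, min_pool_size=10)
      lb_uri = build_uri(["lb.example:27017"], min_pool_size=10, load_balanced=True)

      direct_total = eager_connections(500, 10, pools_per_client=len(mongos_hosts))
      lb_total = eager_connections(500, 10, pools_per_client=1)
      ```

      Under these assumptions the direct run opens three times as many connections as the load-balanced run. Whether that difference is what actually relieves a storm is the question this ticket is meant to answer, so treat the numbers as a starting point for choosing workload parameters, not as a prediction.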

      As we do this, we should document our progress and the steps we've taken to get there, either in this ticket or in a Google Doc.

            Assignee:
            george.wangensteen@mongodb.com George Wangensteen
            Reporter:
            matthew.saltz@mongodb.com Matthew Saltz (Inactive)
            Votes:
            0
            Watchers:
            9

              Created:
              Updated:
              Resolved: