Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-63261

Add metrics for wait time for requests to acquire egress connections

    • Type: Icon: Improvement Improvement
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 6.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Fully Compatible
    • Service Arch 2022-2-21, Service Arch 2022-03-07, Service Arch 2022-03-21, Service Arch 2022-04-04, Service Arch 2022-04-18, Service Arch 2022-05-02, Service Arch 2022-05-16, Service Arch 2022-05-30, Service Arch 2022-06-13
    • 4

      On sharded clusters, client requests and system operations need to be routed from mongos to mongod, and occasionally from mongod to mongod. These requests need to acquire an outbound connection from the source host to the target host. This process is asynchronous, and we don't know how long it takes for requests to acquire a connection. As a result, it's difficult for us to determine when egress connection pooling might be a bottleneck for servicing requests, or what the exact impact on request latency is when the connection pool is under strain. 

       

      We should add metrics that answer the question: 'how much time does a request from (serverA) to (serverB) spend waiting to acquire a connection?' It might be best to use a histogram-based approach, in the style of SERVER-59858, where we maintain a histogram of wait-times for the last N connections/over the last X minutes. We could also 'rotate' the histograms, where we always keep one for the last (say) minute, and then have an 'aggregated' one of the last N minutes. It also would be ideal to collect the histograms on a per-targeted-host basis. 

      Generally, egress connections are acquired by requests  here for the NITL-based task executors. ScopedDBConnection, defined here, is also sometimes used to acquire connections, namely in the old "scanning" RSM, dbclient_rs, and a 1 or 2 sharded commands, but it may not be worth it to collect metrics for this outdated component.  

            Assignee:
            vojislav.stojkovic@mongodb.com Vojislav Stojkovic
            Reporter:
            george.wangensteen@mongodb.com George Wangensteen
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: