[SERVER-63263] Add metric for connection establishment once MongoDB accepts() a new connection on a socket Created: 03/Feb/22  Updated: 29/Oct/23  Resolved: 20/Apr/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.1.0-rc0

Type: Improvement Priority: Major - P3
Reporter: George Wangensteen Assignee: Reo Kimura (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Documented
is documented by DOCS-15261 [SERVER] Investigate changes in SERVE... Closed
Backwards Compatibility: Fully Compatible
Sprint: Service Arch 2022-2-21, Service Arch 2022-03-07, Service Arch 2022-03-21, Service Arch 2022-04-04, Service Arch 2022-04-18, Service Arch 2022-05-02
Participants:
Story Points: 3

 Description   

We have often seen large delays in client-perceived connection establishment latency, and don't have enough data to pin down exactly where the delay is. While we often suspect delays may be happening in the TCP stack below MongoDB, or perhaps in the network, we don't know how long MongoDB itself takes on average to accept a new connection.

As a first step, we should add a histogram that reveals how much time elapses from when a connection is accept()ed on a socket by MongoDB's listener thread until the connection is given its own dedicated thread and begins to run operations.



 Comments   
Comment by Githook User [ 19/Apr/22 ]

Author:

{'name': 'Reo Kimura', 'email': 'reo.kimura@mongodb.com', 'username': 'rkimura21'}

Message: SERVER-63263 Add metric for connection establishment once MongoDB accepts() a new connection on a socket
Branch: master
https://github.com/mongodb/mongo/commit/ee1a5f6c6cffda99cfdc1abc9ec29c5daa2cfaa2

Comment by Reo Kimura (Inactive) [ 09/Mar/22 ]

bruce.lucas Yes, that is correct. I'll update this ticket with the relevant information as the code review progresses.

Comment by Bruce Lucas (Inactive) [ 09/Mar/22 ]

george.wangensteen, reo.kimura, can you please add a comment to this ticket summarizing the design that you are going with? From the code review I think you are adding only a single metric, cumulativeConnectionEstablishmentLatency, and not a histogram. Is that correct?

Comment by Bruce Lucas (Inactive) [ 03/Feb/22 ]

I assume the intent is to record this in FTDC. If so, ideally the histogram would have a small number of buckets so it doesn't add a large FTDC burden.
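
To illustrate what a histogram with a small number of buckets could look like, here is a minimal Python sketch. It is not the server's implementation: the bucket boundaries, the microsecond unit, and the LatencyHistogram class are all hypothetical and chosen only for illustration.

```python
import bisect

# Hypothetical inclusive upper bounds, in microseconds (an assumption; the
# ticket does not specify units or boundaries). One extra open-ended bucket
# catches everything above the last bound.
BUCKET_UPPER_BOUNDS_US = [100, 1_000, 10_000, 100_000, 1_000_000]


class LatencyHistogram:
    """Fixed-bucket histogram: one counter per bucket, nothing else stored."""

    def __init__(self, upper_bounds):
        self.upper_bounds = list(upper_bounds)
        self.counts = [0] * (len(self.upper_bounds) + 1)

    def record(self, latency_us):
        # bisect_left returns the index of the first bound >= latency_us,
        # so a value equal to a bound lands in that bound's bucket.
        idx = bisect.bisect_left(self.upper_bounds, latency_us)
        self.counts[idx] += 1


hist = LatencyHistogram(BUCKET_UPPER_BOUNDS_US)
for sample_us in [50, 100, 750, 20_000, 5_000_000]:
    hist.record(sample_us)
```

With five bounds, each diagnostic sample carries only six counters regardless of how many connections were recorded, which is the property that keeps the per-sample cost small.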

Alternatively, you could add a single metric that records cumulative time for all connections between the two points you mention. This could then be used to compute the average time per connection over any interval you want (delta time divided by delta connections). We've generally found that histograms don't add a lot of diagnostic value above what you get with averages.
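
The "delta time divided by delta connections" calculation can be sketched as follows. This is only an illustration in Python: the Sample type, its field names, and the microsecond unit are assumptions, not the actual FTDC schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    """One point-in-time reading of two monotonically increasing counters.

    Field names and units are hypothetical.
    """
    total_connections: int       # connections accepted so far
    cumulative_latency_us: int   # summed establishment latency so far


def average_latency_us(earlier: Sample, later: Sample) -> Optional[float]:
    """Average establishment latency for connections between two samples."""
    delta_connections = later.total_connections - earlier.total_connections
    if delta_connections <= 0:
        # No new connections in the interval (or a counter reset),
        # so an average is undefined.
        return None
    delta_latency = later.cumulative_latency_us - earlier.cumulative_latency_us
    return delta_latency / delta_connections


earlier = Sample(total_connections=1_000, cumulative_latency_us=250_000)
later = Sample(total_connections=1_100, cumulative_latency_us=300_000)
avg = average_latency_us(earlier, later)  # 50_000 us over 100 connections
```

Because both counters only ever increase, any two samples can be paired, so the averaging interval is chosen at analysis time rather than fixed when the metric is recorded.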

Generated at Thu Feb 08 05:57:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.