[SERVER-56013] NetworkInterfaceExceededTimeLimit error could use some extra information
Created: 10/Apr/21  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Lamont Nelson
Assignee: Backlog - Service Architecture
Resolution: Unresolved
Votes: 0
Labels: sa-remove-fv-backlog-22
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links: Related
Assigned Teams: Service Arch
Participants:
Linked BF Score: 0

 Description   

When we receive a NetworkInterfaceExceededTimeLimit error due to a connection timeout, the error is missing two things: 1) the host we were connecting to, and 2) how long we actually waited to get a connection. Usually #1 is clear from the context in the logs, but #2 is often unclear. It would be nice to provide both explicitly in the error, since timing is often crucial to diagnosing other issues.

Example:

[js_test:safe_secondary_reads_causal_consistency] 2021-01-18T14:28:15.239+0000 s22028| 2021-01-18T14:28:15.239+00:00 D1 -        23074   [conn12] "User assertion","attr":{"error":"NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit","file":"src/mongo/s/async_requests_sender.cpp","line":295
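
To illustrate the kind of message being asked for, here is a minimal sketch in Python. It is not the server's C++ code. The helper names `format_connection_timeout` and `elapsed_ms`, their parameters, the placeholder host, and the exact wording of the extended message are all assumptions made for illustration. Only the `NetworkInterfaceExceededTimeLimit: Couldn't get a connection` prefix is taken from the log line above.

```python
import time


def format_connection_timeout(host: str, waited_ms: int, limit_ms: int) -> str:
    """Build a timeout message that names the host and the actual wait.

    Hypothetical helper: the name, parameters, and wording are assumptions
    for illustration, not the server's real error text.
    """
    return (
        "NetworkInterfaceExceededTimeLimit: Couldn't get a connection to "
        f"{host} within the time limit of {limit_ms}ms (waited {waited_ms}ms)"
    )


def elapsed_ms(start: float, end: float) -> int:
    """Convert two time.monotonic() readings into whole milliseconds."""
    return round((end - start) * 1000)


# Usage: take a monotonic reading when the connection is requested, then
# report the elapsed wait when the timeout fires.
start = time.monotonic()
message = format_connection_timeout(
    host="example-host:27017",  # placeholder host, not from the ticket
    waited_ms=elapsed_ms(start, start + 20.0),  # pretend 20 seconds passed
    limit_ms=20000,
)
print(message)
```

With both values in the message, a reader can tell which host timed out and whether the wait matches the configured limit, without reconstructing the request time from earlier log lines.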



 Comments   
Comment by Lamont Nelson [ 20/Apr/21 ]

This is about the information in the log line itself. When debugging, I often have to figure out when a request was made to determine whether the timeout error makes sense. It usually does, but it's an extra step when trying to account for what the server is doing.
