Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: None
Assigned Teams: Networking & Observability
Sprint: N&O Prioritized List
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

The timer behind the average latency metric (aka opLatency) currently starts too late and stops too early, so it under-reports the time a command actually spends in the server.

For example, on v6.0, starting the timers late in SEP::_initiateCommand implicitly assumes that the latency of the command path between receiving a network request and reaching this call is negligible. HELP-68909 (v6.0) shows that this is not true: there is lock contention in areas before this call (the vivify mutex during SEP::_initiateCommand, the ServiceContext mutex during opCtx creation). Other parts of the code outside our timers may also add meaningful latency.
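To make the gap concrete, here is a minimal standalone sketch of the effect described above. It is not server code: `LatencySample`, `simulatedWork`, and `handleRequest` are hypothetical names, and the sleeps stand in for pre-dispatch contention and command execution. It only illustrates how a timer that starts after the contended work reports less than a timer that starts when the request is received.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical pair of measurements for one request.
struct LatencySample {
    Clock::duration narrow;  // timer started late, after pre-dispatch work
    Clock::duration wide;    // timer started when the request was received
};

// Hypothetical stand-in for real work (e.g. waiting on a contended mutex).
inline void simulatedWork(std::chrono::milliseconds d) {
    std::this_thread::sleep_for(d);
}

inline LatencySample handleRequest(std::chrono::milliseconds preDispatch,
                                   std::chrono::milliseconds execution) {
    const auto received = Clock::now();    // where a wider timer would start
    simulatedWork(preDispatch);            // assumed contention before the timer
    const auto timerStart = Clock::now();  // where the narrow timer starts
    simulatedWork(execution);              // the part the narrow timer covers
    const auto done = Clock::now();

    LatencySample sample{done - timerStart, done - received};
    // steady_clock is monotonic, so the wide window always contains the narrow one.
    assert(sample.wide >= sample.narrow);
    return sample;
}
```

Calling `handleRequest(std::chrono::milliseconds(20), std::chrono::milliseconds(5))` yields a `narrow` value that misses the full pre-dispatch delay, which is the kind of under-reporting seen when contention happens before the timers start.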

I think it's worth investigating how these metrics can be extended to include more of the command processing path. This issue is loosely outlined in the "Addressing Server Networking Problems" document (observability section).

Note: the ServiceContext contention doesn't happen anymore (v8.0+) because of optimizations, though the vivify contention can still happen on v8.0+.

is related to: SERVER-101116 maxTimeMS deadline is set after potentially blocking work (Open)

related to: SERVER-91491 Expand durationMillis of slow query log to cover command authorization and parsing (Closed)

Assignee: Unassigned
Reporter: Alex Li
Participants: Alex Li
Votes: 2
Watchers: 10

Created: Jan 27 2025 04:58:45 PM UTC
Updated: Mar 24 2025 08:49:37 PM UTC
