[SERVER-29632] Expose time spent holding locks, excluding time spent waiting to acquire them Created: 14/Jun/17  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Diagnostics
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: David Storch Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-33155 Export/report lock-held time statistics Closed
Assigned Teams:
Query Execution
Participants:

 Description   

Since version 3.0, the latencies reported in the logs and profiler have generally represented the entire operation latency, including any time spent acquiring locks. We separately report lock wait times in the locks section of the logs, system.profile, and serverStatus. However, often it is interesting to analyze the time an operation takes without including lock acquisition. For example, in the case of queues on a lock caused by a slow operation, this would help to identify which operations are actually taking a long time for the execution path to process, as opposed to those taking a long time because they need a lock which is currently held by another op.

If we decide to explicitly expose such a quantity, we should think through how to add it to the following debug mechanisms:

  • Slow query logs.
  • system.profile collection.
  • db.currentOp().
  • The top command.
  • Global and per-collection latency stats.
  • serverStatus().


 Comments   
Comment by James Wahlin [ 15/Jun/17 ]

The only thing I would clarify is that total execution time can include a significant portion where we are neither acquiring or holding a lock (for example when performing in-memory sort in aggregation). This ticket should look to make clear lock acquisition vs lock-held vs no-lock-held duration, though I think it would be reasonable for the last to be derived by subtracting the lock related durations from the total.

Comment by David Storch [ 14/Jun/17 ]

There is some additional context in this thread. SERVER-18637 also points out that legacy read operations continue to exclude lock acquisition time, which is an inconsistency that we plan to resolve for version 3.6.

charlie.swanson and james.wahlin, is there anything else you wish to add here.

Generated at Thu Feb 08 04:21:24 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.