[SERVER-18637] Measure initial lock time in operation latency Created: 22/May/15  Updated: 14/Jun/17  Resolved: 14/Jun/17

Status: Closed
Project: Core Server
Component/s: Admin, Diagnostics
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Igor Canadi Assignee: David Storch
Resolution: Duplicate Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-29304 Exclude time spent blocking for await... Closed
is duplicated by SERVER-20632 Query execute time not include some L... Closed
Related
Operating System: ALL
Sprint: Query 2017-06-19
Participants:

 Description   

Based on this thread, v3.0 should include initial locking time in total operation latency: https://groups.google.com/forum/#!searchin/mongodb-dev/mark$20callaghan%7Csort:date/mongodb-dev/l2fR4Xr-Bxo/ibYPp7BB4-cJ

However, this is not true. As an example, here is a log line of an operation that claims latency of 2ms, but it waited 1.5sec for a lock:

 query ... query: { .... } planSummary: IXSCAN { venueId: 1, timestamp: 1, activeCommand: 1 } ntoreturn:100 ntoskip:0 nscanned:31 nscannedObjects:31 keyUpdates:0 writeConflicts:0 numYields:1 nreturned:31 reslen:6475 locks:{ Global: { acquireCount: { r: 4 } }, Database: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 1468984 } }, Collection: { acquireCount: { r: 2 } } } 2ms



 Comments   
Comment by David Storch [ 14/Jun/17 ]

This is fixed by SERVER-29304 as of commit a68f209e4ff. The behavior is also now tested by jstests/noPassthrough/latency_includes_lock_acquisition_time.js. I am therefore resolving this ticket as a duplicate.

Comment by Andy Schwerin [ 07/Mar/16 ]

This improvement would also make maxTimeMs much more useful, since then you're typically interested in a certain amount of wall clock time passing before you abandon an operation. Fixing it well would involve eliminating the concept of ensureStarted entirely, and just starting the clock on CurOp (or OperationContext) construction. I'm highly in favor of this approach. geert.bosch, how much work would it be to fix slow query accounting in an acceptable way?

Comment by Geert Bosch [ 31/Aug/15 ]

Hi Igor,

Apologies on the delay to getting back to you on this.

A problem with this approach is that simple point queries or single indexed updates/removes, that have to wait long to obtain their initial locks, would get flagged as "slow queries". This would make the slow query reporting not very useful. So, if we'd do this, we should also change the computation of the slow query time as the total elapsed time of the query, minus all time waiting to acquire locks.

With that, I think this change would be a welcome improvement.

Comment by Igor Canadi [ 22/May/15 ]

Extended the ensureStarted() to AutoGetCollectionForRead too: https://github.com/mongodb-partners/mongo/commit/f72ba710608f2ab0e7d5d52a62e7bb7912ca5380

Comment by Igor Canadi [ 22/May/15 ]

I think this patch should fix it: https://github.com/mongodb-partners/mongo/commit/9ab56ccf716bdbef05c2e588b1135179dc5253d3

Generated at Thu Feb 08 03:48:18 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.