[SERVER-47446] Measure cpu time taken by operations Created: 09/Apr/20  Updated: 29/Oct/23  Resolved: 14/Oct/20

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: 4.9.0

Type: New Feature
Priority: Critical - P2
Reporter: Mira Carey
Assignee: Amirsaman Memaripour
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-31541 Measure CPU utilization for specific ... Closed
is related to SERVER-51601 Make OperationCPUTimer initializer mo... Closed
Backwards Compatibility: Fully Compatible
Sprint: Service arch 2020-10-19
Participants:
Case:

 Description   

It would be valuable to measure the CPU time consumed by an operation, both while it is running and rolled up per user (to measure resource consumption by a particular user). We could potentially expose this via currentOp, via server status metrics, or perhaps in a custom agg pipeline (if rolled up by user).

On Linux, pthread_getcpuclockid() offers a relatively easy way to access the CPU time consumed by a particular thread. We could then capture the time at opCtx creation, store the delta on each call to opCtx->checkForInterrupt, and flush the final value into the user-level rollup at opCtx destruction.
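
A minimal sketch of the approach described above, assuming Linux and an operation that stays on a single thread for its lifetime. ThreadCpuTimer and burnCpu are hypothetical names chosen for illustration; this is not the OperationCPUTimer that eventually landed in the server. Only pthread_getcpuclockid() and clock_gettime() are real library calls.

```cpp
#include <pthread.h>
#include <time.h>

#include <cassert>
#include <cerrno>
#include <chrono>
#include <cstdint>
#include <system_error>

// Hypothetical helper (illustration only): records the calling thread's CPU
// time at construction and reports the delta on demand.
class ThreadCpuTimer {
public:
    ThreadCpuTimer() {
        // pthread_getcpuclockid() returns an error number directly (not via errno).
        int rc = pthread_getcpuclockid(pthread_self(), &_clockId);
        if (rc != 0)
            throw std::system_error(rc, std::generic_category(), "pthread_getcpuclockid");
        _start = now();  // analogous to capturing the time at opCtx creation
    }

    // CPU time consumed by the thread since construction. In the scheme from
    // the description this would be sampled at each interrupt check and once
    // more at destruction.
    std::chrono::nanoseconds elapsed() const {
        auto current = now();
        assert(current >= _start);  // a thread's CPU clock does not go backwards
        return current - _start;
    }

private:
    std::chrono::nanoseconds now() const {
        timespec ts;
        if (clock_gettime(_clockId, &ts) != 0)
            throw std::system_error(errno, std::generic_category(), "clock_gettime");
        return std::chrono::seconds(ts.tv_sec) + std::chrono::nanoseconds(ts.tv_nsec);
    }

    clockid_t _clockId;
    std::chrono::nanoseconds _start;
};

// Hypothetical demo: spin for the given number of iterations and return the
// CPU nanoseconds the calling thread consumed while doing so.
std::int64_t burnCpu(std::uint64_t iterations) {
    ThreadCpuTimer timer;
    volatile std::uint64_t sink = 0;  // volatile keeps the loop from being optimized away
    for (std::uint64_t i = 0; i < iterations; ++i)
        sink = sink + i;
    return timer.elapsed().count();
}
```

Because the clock counts only time the thread spends on a CPU, time spent blocked or sleeping does not show up in elapsed(), which is what makes it useful for separating CPU work from waiting. On older glibc versions this may need to be built with -pthread.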



 Comments   
Comment by Githook User [ 14/Oct/20 ]

Author: Amirsaman Memaripour (samanca) <amirsaman.memaripour@mongodb.com>

Message: SERVER-47446 Measure cpu time taken by operations
Branch: master
https://github.com/mongodb/mongo/commit/1f0009a389042c24360509625d50a9e3812658c7

Comment by Bruce Lucas (Inactive) [ 14/Apr/20 ]

Yes to the first sentence. We have gotten better about accounting for wait time (e.g. we have storage wait time now from WT), but by no means all of it is accounted for, so this would help us know whether it was CPU time or some unaccounted-for wait time.

Regarding your second suggestion, sounds useful for currentOp, but maybe less so for slow query logging.

Comment by Mira Carey [ 13/Apr/20 ]

bruce.lucas,

"I think it could be helpful to have this in logged slow operations."

To break out when operations are slow due to blocking versus because they're actually doing work? If that's the specific use case, I wonder whether a breakdown of CPU cycles used in the recent past might be a nice addition.

Comment by Bruce Lucas (Inactive) [ 13/Apr/20 ]

I think it could be helpful to have this in logged slow operations.

Generated at Thu Feb 08 05:14:13 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.