[SERVER-33155] Export/report lock-held time statistics Created: 07/Feb/18 Updated: 21/Mar/18 Resolved: 21/Feb/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | David Bartley | Assignee: | Kelsey Schubert |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
| Participants: | |||||||||||||
| Description |
|
serverStatus includes stats for lock acquisitions and deadlocks, but doesn't report lock held time. It would be good if it included these, since that information is useful for diagnosing problematic nodes. I can provide a patch that we've been running in production for several months if useful. |
| Comments |
| Comment by Kelsey Schubert [ 21/Feb/18 ] |
|
Hi bartle, I'm closing this ticket in favor of Kind regards, |
| Comment by David Bartley [ 09/Feb/18 ] |
|
Since WiredTiger supports document-level locking, we typically find that lock held time is a pretty good proxy for operation time (lock acquiring time is usually negligible). If mongo wanted to support per-collection/per-db operation times, we'd collect those metrics, but I think we'd still opt to collect more detailed lock information as well, as it's always better to over-collect metrics. I think it'd be fine to report per-collection information via collStats and dbStats, though we've been fine with having that information reported via serverStatus (it means that our metrics collector only needs to issue a single command, vs one per collection, which tends to be fairly slow). Since serverStatus already supports a mechanism to limit section output, one could imagine adding an extendedLocks section that would be disabled by default? |
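For illustration, the section-limiting mechanism mentioned above can be sketched as a small helper that builds a serverStatus command document. This is a minimal sketch, not the patch discussed in this ticket: the helper name `build_server_status_command` is made up for the example, and `extendedLocks` is the hypothetical section proposed in the comment, which does not exist in the server.

```python
def build_server_status_command(include=(), exclude=()):
    """Build a serverStatus command document that toggles output sections.

    Sections set to 1 are requested, sections set to 0 are suppressed.
    """
    overlap = set(include) & set(exclude)
    if overlap:
        raise ValueError(f"sections both included and excluded: {sorted(overlap)}")
    # The command name must be the first key in the command document.
    cmd = {"serverStatus": 1}
    for section in include:
        cmd[section] = 1
    for section in exclude:
        cmd[section] = 0
    return cmd


# "extendedLocks" is hypothetical (the section proposed in this ticket);
# "locks", "metrics", and "wiredTiger" are existing serverStatus sections.
cmd = build_server_status_command(
    include=["locks", "extendedLocks"],
    exclude=["metrics", "wiredTiger"],
)

# With pymongo and a reachable server, this could be sent as:
#   from pymongo import MongoClient
#   status = MongoClient().admin.command(cmd)
```

A metrics collector that opts in this way would still issue a single command per node, which is the property the comment wants to preserve.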
| Comment by Bruce Lucas (Inactive) [ 08/Feb/18 ] |
|
Hi David, Currently we have global operation count and operation latency metrics which give some related information. I think your request differs from this in two ways:
Thanks, |
| Comment by David Bartley [ 08/Feb/18 ] |
|
We've only found them useful in conjunction with https://jira.mongodb.org/browse/SERVER-33156; with that, there are a few ways we've seen this be useful: |
| Comment by Kelsey Schubert [ 08/Feb/18 ] |
|
Hi bartle, Thanks for the feature request; we'd be interested in reviewing your patch. Would you be willing to open a pull request? Could you also speak to the types of issues you are using these metrics to diagnose? I suspect that this information would be helpful to have, but additional context would help us as we consider how we can most effectively report these types of metrics. Please note that for us to consider a pull request, we would need you to sign the contributor agreement. Thanks again, |