[SERVER-20217] Allow reporting of page faults and context switches during slow queries Created: 31/Aug/15  Updated: 05/Apr/17  Resolved: 28/Nov/16

Status: Closed
Project: Core Server
Component/s: Concurrency
Affects Version/s: 3.0.6, 3.1.7
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Geert Bosch Assignee: Mark Benvenuto
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Sprint: Platforms 2017-01-23
Participants:

 Description   

Directly keep track of the number of page faults and related counters, such as context switches. The idea is to read the starting values of these counters between operations (under control of a new run-time settable server parameter) and, if a query turns out to be slow, read them again to compute the differences and report those.
The information would be obtained by using getrusage on Linux, and similar functions for other systems where available.
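
As an illustration only (not the server's implementation), the snapshot-and-diff idea can be sketched in Python with the standard `resource` module, which wraps `getrusage` on Unix-like systems. The names `UsageSnapshot` and `run_with_usage_report`, the `slow_ms` threshold and the `enabled` flag (standing in for the proposed server parameter) are hypothetical. The counters read here cover the whole process, so activity from other threads is included in the differences.

```python
import resource
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageSnapshot:
    """Process-wide counters as reported by getrusage(RUSAGE_SELF)."""
    minor_faults: int
    major_faults: int
    voluntary_switches: int
    involuntary_switches: int

    @classmethod
    def take(cls):
        ru = resource.getrusage(resource.RUSAGE_SELF)
        return cls(ru.ru_minflt, ru.ru_majflt, ru.ru_nvcsw, ru.ru_nivcsw)

    def delta(self, later):
        """Return the counter differences between this snapshot and a later one."""
        return UsageSnapshot(
            later.minor_faults - self.minor_faults,
            later.major_faults - self.major_faults,
            later.voluntary_switches - self.voluntary_switches,
            later.involuntary_switches - self.involuntary_switches,
        )


def run_with_usage_report(operation, slow_ms=100.0, enabled=True):
    """Run `operation`; return (result, report).

    `report` is a UsageSnapshot of counter differences when the operation
    took at least `slow_ms` milliseconds and tracking is enabled, else None.
    Both `slow_ms` and `enabled` are hypothetical stand-ins for server settings.
    """
    before = UsageSnapshot.take() if enabled else None
    start = time.monotonic()
    result = operation()
    elapsed_ms = (time.monotonic() - start) * 1000.0
    if before is None or elapsed_ms < slow_ms:
        return result, None
    # Second read happens only for slow operations, as the description proposes.
    return result, before.delta(UsageSnapshot.take())
```

The second `getrusage` read is taken only when the operation is slow, so fast operations pay for a single call when tracking is enabled and nothing when it is disabled.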



 Comments   
Comment by Mark Benvenuto [ 28/Nov/16 ]

With the work in SERVER-24572 and SERVER-24605, we log context switches and page faults to FTDC. Unfortunately, there is no way to correlate the activity to a particular query. The OS only keeps aggregate statistics on a process or system-wide basis. At best, correlation has to be done by hand by examining the FTDC log for the time period in which the query ran, with the understanding that FTDC metrics are process-wide and system-wide aggregates.

I am closing this as by design since we will not output these aggregate statistics in the slow query log output for the reasons I outlined above.

Comment by Oleg Rekutin [ 22/Sep/15 ]

Running on 3.0.6 (with a 3.0.7 back port for some spin lock fixes), haven't tried 3.1.8 or latest dev yet.

Comment by Martin Bligh [ 22/Sep/15 ]

oleg@evergage.com are you running on 3.1.8? That should be considerably better for consistent performance than before; we've been fixing a variety of issues.

Comment by Oleg Rekutin [ 21/Sep/15 ]

Would absolutely love to have more tools to debug slow queries! I've been seeing considerably more latency spikes and variability with WiredTiger vs MMAP, so anything to help figure out latency issues is great.

Generated at Thu Feb 08 03:53:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.