[SERVER-40755] Expose statistics which indicate how many collection scans have executed Created: 20/Apr/19 Updated: 29/Oct/23 Resolved: 19/Jul/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Diagnostics, Querying |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Pawel Terlecki | Assignee: | Sam Mercier |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Query 2019-06-03, Query 2019-06-17, Query 2019-07-01, Query 2019-07-15, Query 2019-07-29 |
| Participants: | |
| Description |
|
It is unclear how prevalent data scans are. Having statistics on collection and index scans would allow us to decide whether improvements in this area are critical for overall performance. |
| Comments |
| Comment by Githook User [ 12/Jul/19 ] | |||||||||
|
Author: {'name': 'samontea', 'username': 'samontea', 'email': 'merciers.merciers@gmail.com'}Message: | |||||||||
| Comment by Githook User [ 10/Jul/19 ] | |||||||||
|
Author: {'name': 'Xiangyu Yao', 'email': 'xiangyu.yao@mongodb.com', 'username': 'xy24'}Message: Revert " This reverts commit a4ef14ef41f0700ef07e5b57b0345d2396a44604. | |||||||||
| Comment by Githook User [ 10/Jul/19 ] | |||||||||
|
Author: {'name': 'samontea', 'email': 'merciers.merciers@gmail.com', 'username': 'samontea'}Message: | |||||||||
| Comment by Asya Kamsky [ 01/Jun/19 ] | |||||||||
|
LGTM | |||||||||
| Comment by David Storch [ 22/May/19 ] | |||||||||
|
bruce.lucas, the reasoning was two-fold:
That said, we could totally add a collection scan counter to serverStatus in addition to $collStats. I'd like to keep any changes around scanned and scannedObjects out of scope for this ticket, since those stats don't directly tell you whether there are collection scans happening or not. The "scanned objects" could be due to a large index scan which requires many documents to be fetched. | |||||||||
| Comment by Bruce Lucas (Inactive) [ 21/May/19 ] | |||||||||
|
david.storch, I'm wondering why we would have scanned and scannedObjects at the serverStatus level but collectionScans at the collection level. Would it make sense to have all three (scanned, scannedObjects, and collectionScans) both per-collection and globally? | |||||||||
| Comment by David Storch [ 21/May/19 ] | |||||||||
|
After discussing with pawel.terlecki, I propose adding a new option to $collStats called queryExecStats. This would cause a new section of statistics to be returned, also called queryExecStats. This document would contain a field called collectionScans, a per-collection 64-bit counter that is incremented whenever a collection scan plan is executed over that collection. It would look something like this:
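As a rough illustration of the semantics described above (not the server implementation, and not the original example from this comment), here is a minimal Python sketch. The class `QueryExecCounters`, its method names, and the namespace `test.orders` are hypothetical; only the `queryExecStats` section and the `collectionScans` field come from the proposal. `COLLSCAN` and `IXSCAN` are used here simply as labels for collection scan and index scan plans.

```python
class QueryExecCounters:
    """Toy model of the proposed per-collection scan counter (hypothetical)."""

    def __init__(self):
        # Maps a namespace such as "db.collection" to its scan count.
        # The proposal specifies a 64-bit counter; Python ints are unbounded,
        # so overflow behavior is not modeled here.
        self._collection_scans = {}

    def record_plan_execution(self, namespace, plan_stage):
        # Only collection scan plans increment the counter; index scans do not.
        if plan_stage == "COLLSCAN":
            count = self._collection_scans.get(namespace, 0)
            self._collection_scans[namespace] = count + 1

    def coll_stats(self, namespace, query_exec_stats=False):
        # The new section is returned only when the option is requested.
        result = {"ns": namespace}
        if query_exec_stats:
            result["queryExecStats"] = {
                "collectionScans": self._collection_scans.get(namespace, 0)
            }
        return result


counters = QueryExecCounters()
counters.record_plan_execution("test.orders", "COLLSCAN")
counters.record_plan_execution("test.orders", "IXSCAN")
counters.record_plan_execution("test.orders", "COLLSCAN")
stats = counters.coll_stats("test.orders", query_exec_stats=True)
```

In this sketch the index scan leaves the counter untouched, so `stats["queryExecStats"]["collectionScans"]` counts only the two collection scan plans.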
bruce.lucas kelsey.schubert asya pawel.terlecki does this plan sound ok to you? If so please respond with an LGTM. In the meantime, I am moving this ticket back to our triage queue so it can be considered for scheduling. | |||||||||
| Comment by Bruce Lucas (Inactive) [ 23/Apr/19 ] | |||||||||
|
serverStatus already has metrics.queryExecutor.scanned and .scannedObjects, which can answer the general question. This information could also be added to collStats and indexStats if more detail would be useful, although identifying the source of the scans usually requires the queries themselves, and we typically get those from the mongod logs. | |||||||||
| Comment by Pawel Terlecki [ 21/Apr/19 ] | |||||||||
|
pasette, part of this ticket is to figure out the best place for these statistics and how much detail should be recorded. If we want to collect metrics per collection/index, collStats and indexStats would be a good place, but serverStatus would still show some aggregated stats, as it does for other WT events. In addition to internal use, this could be used for troubleshooting. Collections that are often scanned probably need some indexing and should be investigated further. | |||||||||
| Comment by Daniel Pasette (Inactive) [ 21/Apr/19 ] | |||||||||
|
Do you mean including them in serverStatus, collStats or indexStats? |