-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Minor - P4
-
None
-
Affects Version/s: None
-
Component/s: None
-
StorEng - Refinement Pipeline
-
None
-
(copied to CRM)
cc linda.qin@mongodb.com, mark.brinsmead@mongodb.com
This came up in a Slack discussion. We can estimate the cache hit ratio from the current statistics:
("pages requested from the cache" - "pages read into cache") / "pages requested from the cache"
However, those statistics cover all threads, internal as well as external. Internal operations such as checkpoint, history store management, and metadata writes are therefore also reflected in them.
The application could get better insight into its cache hit/miss ratio if we also computed a "by application threads" version of these statistics, giving:
("pages requested from the cache by the application threads" - "pages read into cache by the application threads") / "pages requested from the cache by the application threads"
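To make the arithmetic concrete, here is a minimal sketch of the ratio as a helper. The same function applies to the existing all-thread counters and to the proposed application-thread counters, which do not exist yet. The function name and all numbers are illustrative assumptions, not measurements.

```python
from typing import Optional


def cache_hit_ratio(pages_requested: int, pages_read: int) -> Optional[float]:
    """Return (requested - read) / requested, or None if nothing was requested."""
    if pages_requested <= 0:
        # Avoid dividing by zero when no pages were requested.
        return None
    return (pages_requested - pages_read) / pages_requested


# Illustrative values only. The "application threads" counters are the
# statistics proposed in this ticket; they are hypothetical today.
all_threads = cache_hit_ratio(pages_requested=1000, pages_read=250)
app_threads = cache_hit_ratio(pages_requested=800, pages_read=100)

print(f"all threads: {all_threads}, application threads: {app_threads}")
```

With these made-up inputs the two ratios differ because the remaining 200 requests and 150 reads are attributed to internal threads, which is the distortion the application-thread version would remove.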
mark.brinsmead@mongodb.com also pointed out that having these or similar statistics per query would be even more helpful. The logs could then provide a means to calculate how effectively a query utilised the cache. We already report bytes read/written in session-level statistics, so we could potentially also report pages read and pages requested by the application thread there, to get that information into the MongoDB logs.
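One possible way a per-query figure could be derived is sketched below, assuming the session exposed cumulative counters for pages requested and pages read. Those counters, the `SessionCacheCounters` type, and the sample values are all hypothetical; nothing here reflects an existing WiredTiger or MongoDB interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionCacheCounters:
    """Hypothetical cumulative per-session counters (not an existing API)."""
    pages_requested: int
    pages_read: int


def query_hit_ratio(before: SessionCacheCounters,
                    after: SessionCacheCounters) -> Optional[float]:
    """Hit ratio for the work done between two snapshots of one session."""
    requested = after.pages_requested - before.pages_requested
    read_in = after.pages_read - before.pages_read
    if requested <= 0:
        # The query touched no pages, so the ratio is undefined.
        return None
    return (requested - read_in) / requested


# Illustrative snapshots taken before and after a single query.
before = SessionCacheCounters(pages_requested=5000, pages_read=1200)
after = SessionCacheCounters(pages_requested=5400, pages_read=1250)

print(query_hit_ratio(before, after))
```

Working on deltas keeps the per-query number independent of whatever the session did earlier. If the counters were instead reset per operation, the subtraction would be unnecessary.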
Note:
I have not thought this through; there could be performance or other reasons to avoid adding these statistics. If and when we investigate, it is worth having that discussion.