-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
5
-
StorEng - Defined Pipeline
Summary
Add per-session statistics that would allow a user to compute I/O throughput.
Motivation
WiredTiger maintains a handful of statistics for each session. These include:
- WT_STAT_SESSION_BYTES_READ: This is the number of bytes read into the cache. I.e., the sum of the uncompressed sizes of the blocks read into the cache.
- WT_STAT_SESSION_READ_TIME: This is the time spent reading data from the file system. I.e., the sum of the time for the file system I/O requests to read the compressed blocks.
There are comparable stats for writes.
Suggested Solution
It would be most useful to provide matching statistics for each of the above (and the write counterparts):
- WT_STAT_SESSION_FILE_BYTES_READ: This would measure the total size of data read from the file system. I.e., the sum of the sizes of the compressed blocks.
- WT_STAT_SESSION_CACHE_READ_TIME: This would be the time spent loading uncompressed data into the cache. I.e., it would measure the time from the start of the I/O request until the data has been received, decrypted, and decompressed, and sum that time for all data read by the session.
The names are just what I thought of off the top of my head.
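To illustrate why the matching pairs matter, here is a rough sketch of how a consumer might turn the four read counters into throughput numbers. It is not WiredTiger code: the `SessionReadStats` container and its field names are hypothetical stand-ins for the two existing statistics and the two proposed ones, and the time unit (microseconds) is an assumption. The point is that each byte count is divided by the time that covers the same work, which is not possible with only the existing pair.

```python
from dataclasses import dataclass


@dataclass
class SessionReadStats:
    """Hypothetical snapshot of per-session read statistics (not a WiredTiger type)."""
    bytes_read: int          # existing: uncompressed bytes read into the cache
    read_time_us: int        # existing: time in file system reads (microseconds assumed)
    file_bytes_read: int     # proposed: compressed bytes read from the file system
    cache_read_time_us: int  # proposed: time to read, decrypt, and decompress


def mb_per_second(nbytes: int, time_us: int) -> float:
    """Return throughput in MB/s (1 MB = 10**6 bytes).

    Bytes per microsecond is numerically equal to 10**6 bytes per second.
    Returns 0.0 when no time was recorded, to avoid dividing by zero.
    """
    if time_us <= 0:
        return 0.0
    return nbytes / time_us


def file_read_throughput(stats: SessionReadStats) -> float:
    # Compressed bytes over file system I/O time: both describe the raw read.
    return mb_per_second(stats.file_bytes_read, stats.read_time_us)


def cache_read_throughput(stats: SessionReadStats) -> float:
    # Uncompressed bytes over the full load time: both describe filling the cache.
    return mb_per_second(stats.bytes_read, stats.cache_read_time_us)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    sample = SessionReadStats(
        bytes_read=4_000_000,
        read_time_us=500_000,
        file_bytes_read=1_000_000,
        cache_read_time_us=800_000,
    )
    print("file system read MB/s:", file_read_throughput(sample))
    print("cache load MB/s:", cache_read_throughput(sample))
```

Dividing the existing WT_STAT_SESSION_BYTES_READ by the existing WT_STAT_SESSION_READ_TIME would mix uncompressed bytes with compressed I/O time, so the result would overstate file system throughput by roughly the compression ratio.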
With the addition of block caching in WT, we might want to think more widely about the right set of statistics to track the I/O done for each session.
For this to be useful in MongoDB, there will need to be some server work to collect and deliver this data (e.g., in the storage stats reported for slow queries).
This suggestion came from a discussion with geert.bosch@mongodb.com.