We currently maintain cache read/write statistics in four places:
- in the Btree code where we track the after-decompression bytes being read into the cache,
- in the Btree code where we track the after-compression bytes being written out of the cache,
- in the block manager code where we track the bytes read from disk,
- in the block manager code where we track the bytes written to disk.
Someone noticed the first two aren't each other's inverse: reads count bytes coming into the cache, ignoring compression, while writes count bytes leaving the cache, considering compression.
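To make the asymmetry concrete, here's a rough Python sketch. It is not WiredTiger code: `zlib` stands in for whatever compressor is configured, the `Stats` fields and function names are made up for illustration, and it ignores details like block alignment and headers.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Stats:
    # Hypothetical counters mirroring the four statistics described above.
    cache_bytes_read: int = 0    # Btree layer: counted after decompression
    cache_bytes_write: int = 0   # Btree layer: counted after compression
    block_bytes_read: int = 0    # block manager: bytes read from disk
    block_bytes_write: int = 0   # block manager: bytes written to disk


def write_page(page: bytes, stats: Stats) -> bytes:
    """Compress a page and 'write' it, updating both layers' counters."""
    image = zlib.compress(page)
    stats.cache_bytes_write += len(image)   # post-compression size
    stats.block_bytes_write += len(image)
    return image


def read_page(image: bytes, stats: Stats) -> bytes:
    """'Read' an on-disk image and decompress it into the cache."""
    stats.block_bytes_read += len(image)
    page = zlib.decompress(image)
    stats.cache_bytes_read += len(page)     # post-decompression size
    return page


def round_trip(page: bytes) -> Stats:
    """Write one page out of the cache, then read it back in."""
    stats = Stats()
    image = write_page(page, stats)
    assert read_page(image, stats) == page
    return stats


if __name__ == "__main__":
    print(round_trip(b"a" * 4096))
```

For a compressible page, the block manager's read and write counters match after a round trip, but the cache read counter (uncompressed size) ends up larger than the cache write counter (compressed size), which in this simplified model just duplicates the block manager's write number.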
The obvious fix would be for writes to track bytes before compression happens. That's not possible because of raw compression: the reconciliation code is the last place we have a count of bytes to be written before compression takes place.
Stepping back, I don't think the bytes being read into or written out of the cache are interesting statistics when considered separately from the block manager's information. The block manager tells us what I/O looks like; all the cache manager information tells us is (maybe?) how effective compression was. (And if we cared about that, we could pretty easily track that information more exactly.)
I can't recall ever looking at the cache to/from numbers in a case where the block manager's numbers wouldn't have been just as useful (but I'm certainly not the person around here with the most experience looking at these numbers).
Anyway, I'm inclined to remove the cache to/from statistics. What do you all think?
@agorrod, @michaelcahill, @sueloverso