Priority: Major - P3
Resolution: Won't Do
Affects Version/s: None
Fix Version/s: None
Sprint: Storage Engines 2019-11-04
We have seen workloads that:
- Do many updates to one or a few large values.
- Don't allow the oldest transaction ID or oldest timestamp to move forward.
Optimizing such cases is restricted by how WiredTiger manages history: it must be held in cache, because we don't version values that are written to data files beyond checkpoints.
WiredTiger allows more history than fits into cache by using the lookaside (cache overflow) mechanism. It is currently possible to have a page in several different states:
- On disk (no content in cache).
- In memory (all history in cache).
- In "limbo" (a single version of the data is in memory, but the rest of the history is in the lookaside file).
Any time an update is made to an entry on a page, the full history for that page needs to be instantiated in memory. When there are hot records, the in-memory page image can grow larger than memory_page_max, which triggers WiredTiger to forcibly evict the page.
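To make the cycle concrete, here is a toy model of it in Python. It is only a sketch and not WiredTiger code: `PageState`, `ToyPage` and `run` are hypothetical names, sizes are abstract units, and it assumes the oldest ID is pinned so no version can ever be discarded. Under those assumptions, a forced eviction moves everything except the newest version to lookaside, and the next update reads it all back.

```python
from dataclasses import dataclass, field
from enum import Enum


class PageState(Enum):
    ON_DISK = "on disk"      # no content in cache
    IN_MEMORY = "in memory"  # all history in cache
    LIMBO = "limbo"          # newest version in cache, older history in lookaside


@dataclass
class ToyPage:
    # Hypothetical model: threshold named after the memory_page_max setting.
    memory_page_max: int
    state: PageState = PageState.IN_MEMORY
    in_cache: list = field(default_factory=list)   # sizes of versions held in cache
    lookaside: list = field(default_factory=list)  # sizes of versions in lookaside
    lookaside_reads: int = 0
    lookaside_writes: int = 0

    def footprint(self):
        return sum(self.in_cache)

    def update(self, value_size):
        # An update needs the full history, so a limbo page reads it back first.
        if self.state is PageState.LIMBO:
            self.lookaside_reads += len(self.lookaside)
            self.in_cache = self.lookaside + self.in_cache
            self.lookaside = []
            self.state = PageState.IN_MEMORY
        self.in_cache.append(value_size)
        # Oversized page: forced eviction. Assumption: history is pinned, so all
        # versions except the newest are written to lookaside.
        if self.footprint() > self.memory_page_max:
            self.lookaside = self.in_cache[:-1]
            self.lookaside_writes += len(self.lookaside)
            self.in_cache = self.in_cache[-1:]
            self.state = PageState.LIMBO


def run(n_updates, value_size, memory_page_max):
    """Apply n_updates same-sized updates to one hot record on a fresh page."""
    page = ToyPage(memory_page_max=memory_page_max)
    for _ in range(n_updates):
        page.update(value_size)
    return page
```

In this model, once the page first exceeds the threshold, every further update re-reads and re-writes the whole accumulated history, so lookaside traffic grows roughly quadratically with the number of updates. Comparing `lookaside_reads` and `lookaside_writes` from `run(...)` for different update counts shows the effect.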
We should reproduce a workload that pathologically reads pages back from lookaside and re-writes them to it, and investigate how to make that cycle more efficient.