-
Type: Task
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Storage Engines
-
None
-
None
We have seen a number of workloads recently where an in-memory page is quite large and is repeatedly reconciled. Each reconciliation picks up additional modifications, but those modifications account for a relatively modest proportion of the total page size.
At the moment the page is fully reconciled, which results in a split (multiple on-disk pages). Every on-disk page is rewritten for each reconciliation.
A concrete example: if an in-memory page in an index has 256 KB of content, it will be reconciled into 16 on-disk pages. That page is often retained in the cache as a single page. If just one change is made to the page and it is reconciled again, the second reconciliation generates another 16 on-disk pages.
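The arithmetic behind that example can be sketched as follows. This is only an illustration: the 16 KB on-disk page size is an assumption inferred from 256 KB mapping to 16 pages, and `pages_written` is a hypothetical helper, not a WiredTiger function.

```python
import math


def pages_written(page_bytes, disk_page_bytes, reconciliations):
    """On-disk pages written when every reconciliation rewrites the whole split."""
    chunks = math.ceil(page_bytes / disk_page_bytes)
    return chunks * reconciliations


IN_MEMORY_BYTES = 256 * 1024  # in-memory page size from the example
ON_DISK_BYTES = 16 * 1024     # assumption: implied by 256 KB -> 16 on-disk pages

print(pages_written(IN_MEMORY_BYTES, ON_DISK_BYTES, 1))  # first reconciliation
print(pages_written(IN_MEMORY_BYTES, ON_DISK_BYTES, 2))  # after one more change
```

Under these assumptions the page count grows linearly with the number of reconciliations, regardless of how little of the content changed.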
Rewriting every on-disk page for a single change is a waste of I/O bandwidth, and we should do a better job in WiredTiger. Some things we could do are:
- If a page is reconciled and split multiple times, encourage it to be evicted so that the in-memory representation reflects the split state of the tree. It isn't always possible to evict the page: we need exclusive access (i.e., not a checkpoint reconciliation), and the in-memory page must adhere to the materialization frontier.
- Track the pages that are generated by a reconciliation, and don't rewrite a page that has identical content to one created in a prior reconciliation. WT-11168 has some work related to that, but we don't currently have a mechanism for tracking "this page is the same".
- When reconciling a page that was previously multi-split, guide the subsequent reconciliation to use the same split points where reasonable, to minimize the number of different pages generated.
- Add the ability to generate delta images that apply to the previously split (multi) reconciliation.
- Your good idea goes here.
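As a rough illustration of the "track identical pages" and "reuse split points" ideas above, here is a toy model. It is not WiredTiger code: `split`, `digest`, and `reconcile` are hypothetical names, the split rule is a fixed record count per page, and a SHA-256 digest stands in for whatever "this page is the same" mechanism would actually be used.

```python
import hashlib


def split(records, per_page):
    """Cut the record list at fixed positions (stable toy split points)."""
    return [records[i:i + per_page] for i in range(0, len(records), per_page)]


def digest(chunk):
    """Content fingerprint for one would-be on-disk page."""
    h = hashlib.sha256()
    for key, value in chunk:
        h.update(repr((key, value)).encode())
    return h.hexdigest()


def reconcile(records, per_page, previous=None):
    """Return (digests, pages_written).

    `previous` holds the digests from the prior reconciliation. A chunk is
    written only if no page with the same content exists at the same position.
    """
    previous = previous or []
    digests = [digest(chunk) for chunk in split(records, per_page)]
    written = sum(
        1 for i, d in enumerate(digests)
        if i >= len(previous) or previous[i] != d
    )
    return digests, written


records = [(i, "v0") for i in range(160)]      # 160 records -> 16 chunks of 10
first, written_first = reconcile(records, 10)
records[42] = (42, "v1")                       # a single update
second, written_second = reconcile(records, 10, first)
print(written_first, written_second)
```

In this model the second reconciliation writes one page instead of sixteen. The comparison is positional, so it only pays off when split points stay the same between reconciliations; an insert that shifted every boundary would change every digest.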
Note that we are seeing this pattern when running the YCSB 100 update workload in DSC, as well as for low-throughput workloads that slowly increase the size of the right-most page in the oplog but don't grow it large enough to force an eviction.
- is related to: WT-11168 Remove the page image reuse logic (Closed)