The page-memory adjustments during tree deepening splits are wrong.
There are two underlying problems. First, it's difficult to be sure we're correctly tracking the memory we move between parent and child pages during splits (see 6499e94, 5e2d0ed and df0cb0d). Second, it's difficult to be sure we're applying the memory overhead calculation consistently between the original allocation and the split: for example, the page-creation code doesn't apply the memory overhead calculation to each WT_REF individually when allocating WT_REF structures, but the page-split code does apply it when moving WT_REF structures between pages.
The current workaround in the tree is to set the page's memory footprint value explicitly after we deepen the tree during a split.
I suggest:
- Remove the notion of memory overhead calculation on a per-allocation basis. Replace it with a configurable overhead percentage that adjusts the cache-size values.
I think this is better for a few reasons:
- a global percentage is potentially as accurate as the per-allocation approach, because both approaches are equally vulnerable to workloads that make them incorrect. For example, we don't apply the memory overhead calculation to memory allocated during reconciliation, so a reconciliation-heavy workload will appear to use the same amount of memory as a read-only workload, which isn't correct;
- a configurable percentage lets applications correct the value when they use a different underlying memory allocator, or when their workload has a fundamentally different memory load from what we measured when we chose the per-allocation constant;
- a global percentage means we don't have to solve our current problem where we are inconsistently applying the memory overhead constant;
- a global percentage is easier to implement and faster.
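
To make the global-percentage idea concrete, here is a rough sketch of what I have in mind. The function names and the choice to inflate the tracked byte count (rather than shrink the configured cache size) are assumptions for illustration only, not existing WiredTiger code:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not a WiredTiger function: inflate the tracked byte
 * count by a single global percentage instead of padding each allocation.
 */
static uint64_t
cache_bytes_with_overhead(uint64_t tracked_bytes, unsigned overhead_pct)
{
    /* Split the multiplication so very large byte counts don't overflow. */
    return (tracked_bytes + (tracked_bytes / 100) * overhead_pct +
        ((tracked_bytes % 100) * overhead_pct) / 100);
}

/* Return nonzero if the adjusted usage exceeds the configured cache size. */
static int
cache_over_limit(
    uint64_t tracked_bytes, uint64_t cache_size, unsigned overhead_pct)
{
    return (cache_bytes_with_overhead(tracked_bytes, overhead_pct) >
        cache_size);
}
```

With this shape the percentage is applied in one place, when the cache usage is read, so the split code never has to know about overhead at all.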
- Write a function to calculate the current WT_PAGE.memory_footprint value and check the value for correctness when pages are discarded (or maybe during reconciliation), to confirm we're tracking page memory adjustments during splits.
- We could immediately change the current page split code to call the function that calculates the current value of the page's memory footprint, instead of just setting the value. When we trust the memory tracking code to consistently get the page's memory footprint correct, we could turn that code back on in the split code.
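
A minimal sketch of the recalculate-and-check function, using simplified stand-in structures: the `struct ref` and `struct page` types and all three function names below are hypothetical, and the real WT_REF and WT_PAGE have many more fields to account for:

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for WT_REF and WT_PAGE; the real structures differ. */
struct ref {
    size_t key_size; /* bytes allocated for this reference's key */
};

struct page {
    size_t memory_footprint; /* incrementally tracked value */
    struct ref *refs;        /* child references owned by this page */
    size_t ref_count;
};

/* Walk the page and recompute its footprint from scratch. */
static size_t
page_footprint_recalc(const struct page *page)
{
    size_t i, total;

    total = sizeof(*page);
    for (i = 0; i < page->ref_count; ++i)
        total += sizeof(struct ref) + page->refs[i].key_size;
    return (total);
}

/* Diagnostic check: does the tracked value match the recomputed one? */
static int
page_footprint_verify(const struct page *page)
{
    return (page->memory_footprint == page_footprint_recalc(page));
}

/* Interim form for the split code: overwrite the tracked value. */
static void
page_footprint_reset(struct page *page)
{
    page->memory_footprint = page_footprint_recalc(page);
}
```

The split code would call something like `page_footprint_reset` for now, and the discard path would call `page_footprint_verify` in diagnostic builds until we trust the incremental tracking.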
@michaelcahill, @agorrod, thoughts?