Type: Task
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
Storage Engines, Storage Engines - Persistence
SE Persistence - 2026-02-13
There is evidence in the YCSB 100 update (and likely 128 thread load) workloads that WiredTiger wastes work trying to reclaim cache space while a checkpoint is being created.
The scenario runs like this:
- The cache is quite full of dirty content (about 80% of the configured maximum).
- A checkpoint begins, which means any new content cannot be evicted.
- Dirty cache utilization approaches or reaches the upper bound.
- Because it is trying hard to reclaim space (operating in aggressive mode, or with eviction hard targets set), WiredTiger starts queueing pages for reconciliation that it knows are not good candidates for eviction.
- Those reconciliations consume CPU, and sometimes I/O (if there is any new content that can be written back), but do not reclaim meaningful space from the cache.
- The checkpoint slows down because it competes with those reconciliations for resources.
We should understand how much of the non-checkpoint reconciliation work happening during checkpoints is useful, and find heuristics for limiting wasted work. That will allow checkpoints to complete faster.
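As a starting point for that analysis, here is a minimal sketch of how "useful" versus "wasted" reconciliation work could be summarized offline. It assumes we can collect one record per non-checkpoint reconciliation that ran while a checkpoint was in progress. The record fields (`cpu_us`, `bytes_written`, `bytes_reclaimed`), the `min_reclaimed` threshold and the `summarize` helper are all hypothetical names for illustration, not existing WiredTiger statistics or APIs.

```python
from dataclasses import dataclass


@dataclass
class Reconciliation:
    """One non-checkpoint reconciliation seen during a checkpoint (hypothetical record)."""
    cpu_us: int           # CPU time spent reconciling the page, in microseconds
    bytes_written: int    # bytes written back; 0 if there was nothing new to write
    bytes_reclaimed: int  # cache bytes actually freed as a result


def summarize(recs, min_reclaimed=1):
    """Classify a reconciliation as wasted if it freed fewer than min_reclaimed bytes."""
    total_cpu = sum(r.cpu_us for r in recs)
    wasted = [r for r in recs if r.bytes_reclaimed < min_reclaimed]
    wasted_cpu = sum(r.cpu_us for r in wasted)
    return {
        "count": len(recs),
        "wasted_count": len(wasted),
        # Share of reconciliation CPU time that reclaimed no meaningful space.
        "wasted_cpu_fraction": wasted_cpu / total_cpu if total_cpu else 0.0,
        # I/O spent by reconciliations that still did not free cache space.
        "wasted_bytes_written": sum(r.bytes_written for r in wasted),
    }


if __name__ == "__main__":
    # Made-up sample values, only to show the shape of the summary.
    sample = [
        Reconciliation(cpu_us=100, bytes_written=4096, bytes_reclaimed=0),
        Reconciliation(cpu_us=300, bytes_written=0, bytes_reclaimed=0),
        Reconciliation(cpu_us=100, bytes_written=8192, bytes_reclaimed=32768),
    ]
    print(summarize(sample))
```

A high `wasted_cpu_fraction` during checkpoints would support the hypothesis above. The threshold for "meaningful space" is itself something a heuristic would need to tune.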
- depends on WT-16433 Write Performance Reconciliation Efficiency - Metrics for Observability (Open)