- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Cache and Eviction
- Storage Engines - Transactions
- 671.024
- SE Transactions - 2026-06-19
Currently, the prune timestamp eligibility check happens inside __evict_review(), which is called from __wt_evict() after the thread has already taken an exclusive lock on the page, so the check occurs deep in the expensive eviction path. When the check fails, the page is already locked and other threads are blocked from accessing it. Those blocked threads then retry eviction (up to 10 times each), creating an eviction retry storm that wastes CPU cycles.
It appears that moving the prune timestamp check earlier, into __evict_force_check() (the cheap path), could keep threads from entering the expensive eviction path unnecessarily, avoiding the exclusive lock, the retries, and the thread blocking.
Additionally, I added a guard around the prune timestamp check in __evict_review() so that it only takes effect when the page's memory footprint is below the max memory page threshold. This ensures that oversized pages are not blocked from eviction by the timestamp check.
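The two checks described above can be sketched roughly as follows. This is a simplified model, not the actual patch: the sketch_* types and functions are hypothetical stand-ins rather than WiredTiger structures, the prune timestamp eligibility test is treated as an opaque boolean, and the early check is left unguarded because the ticket does not say whether it also carries a size guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a page; not a WiredTiger structure. */
typedef struct {
    uint64_t memory_footprint; /* bytes the page currently occupies in cache */
    bool prune_ts_eligible;    /* result of the prune timestamp check (opaque here) */
} sketch_page;

typedef enum { SKETCH_EVICT_PROCEED, SKETCH_EVICT_SKIP } sketch_decision;

/*
 * Cheap path, modeled on the proposed early check: runs before any exclusive
 * lock is taken, so a failing page never enters the expensive path.
 */
static sketch_decision
sketch_force_check(const sketch_page *page)
{
    assert(page != NULL);
    return (page->prune_ts_eligible ? SKETCH_EVICT_PROCEED : SKETCH_EVICT_SKIP);
}

/*
 * Expensive path, modeled on the guarded check: the prune timestamp result
 * only blocks eviction while the page is below the maximum in-memory size.
 */
static sketch_decision
sketch_review(const sketch_page *page, uint64_t memory_page_max)
{
    assert(page != NULL);
    if (page->memory_footprint < memory_page_max && !page->prune_ts_eligible)
        return (SKETCH_EVICT_SKIP);
    /* Oversized pages proceed regardless of the timestamp check. */
    return (SKETCH_EVICT_PROCEED);
}
```

In this model, a 20MB page that fails the timestamp check still proceeds through sketch_review() when the limit is 10MB, while a 1KB page that fails it is skipped in both functions.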
Experiments with this approach have shown a 68% to 71% reduction in standby lag, depending on the variant (see the results below).
Note: This is an alternative to WT-17042. Both approaches yield a similar reduction in standby lag. However, with WT-17042's approach, cache health can become a concern: in Linkbench on a 32GB machine, pages could grow up to ~1.6GB (since the dirty eviction threshold is 5% of 32GB). In contrast, this approach achieves the same lag reduction while keeping the cache healthy, with page sizes staying within the configured maximum memory page threshold of 10MB.
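For reference, the size comparison in the note can be reproduced with a small helper. This is only a sketch of the arithmetic: threshold_bytes() is a hypothetical name, and it follows the note in treating the full 32GB as the base for the 5% threshold, using binary units.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes corresponding to a percentage of a given size (hypothetical helper). */
static uint64_t
threshold_bytes(uint64_t total_bytes, unsigned pct)
{
    assert(pct <= 100);
    return (total_bytes * pct / 100);
}
```

Calling threshold_bytes(32ULL << 30, 5) gives 1717986918 bytes, about 1.6GB, which is over 160 times the 10MB maximum memory page threshold.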
Results (DSI linkbench workload — standby lag)
Baseline: 330s
https://spruce.corp.mongodb.com/version/69dfd9ff6fbb98000786b7aa/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC
Option 1: Prune ts check in __evict_force_check() + existing prune ts check in __evict_review() → 228s (31% better)
https://spruce.corp.mongodb.com/version/69dfdd684025820007281d45/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC
Option 2: Prune ts check in __evict_force_check() + page-size-guarded prune ts check in __evict_review() → 96s (71% better)
https://spruce.corp.mongodb.com/version/69dfdd090881960007439ad1/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC
Option 3: Only page-size-guarded prune ts check in __evict_review() → 105s (68% better)
https://spruce.corp.mongodb.com/version/69dfde58a991cc0007a17528/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC
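The percentages quoted for each option follow from the baseline of 330s. A minimal sketch of that calculation, with lag_reduction_pct() as a hypothetical helper that rounds to the nearest whole percent:

```c
#include <assert.h>

/* Reduction in lag relative to the baseline, rounded to the nearest percent. */
static unsigned
lag_reduction_pct(unsigned baseline_s, unsigned lag_s)
{
    assert(baseline_s > 0 && lag_s <= baseline_s);
    return (((baseline_s - lag_s) * 100 + baseline_s / 2) / baseline_s);
}
```

With the figures above, lag_reduction_pct(330, 228), lag_reduction_pct(330, 96) and lag_reduction_pct(330, 105) give 31, 71 and 68 respectively.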
- is related to: WT-17042 Force eviction of ingest tree pages can stall write (oplog) threads (Closed)