Limit prune timestamp eviction eligibility check to pages under the memory page max threshold

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Cache and Eviction
    • Storage Engines - Transactions
    • 671.024
    • SE Transactions - 2026-06-19

      Currently, the prune timestamp eligibility check happens inside __evict_review(), which is called from __wt_evict() after the thread has already taken an exclusive lock on the page. The check therefore runs deep in the expensive eviction path. When it fails, the page is already locked, so other threads are blocked from accessing it. Those blocked threads then retry eviction (up to 10 times each), creating an eviction retry storm that wastes CPU cycles.

      It appears that moving the prune timestamp check earlier, into __evict_force_check() (the cheap path), could keep threads from entering the expensive eviction path unnecessarily, which would avoid the exclusive lock, the retries, and the thread blocking.
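The ordering described above can be sketched as a cheap, lock-free pre-check that gates entry into the locking path. This is a minimal illustration only: the struct, its fields, and the sketch_* functions are hypothetical stand-ins, not WiredTiger's actual code, and the exclusive lock is modeled as a simple counter.

```c
#include <stdbool.h>

/* Hypothetical page model; these fields are illustrative, not WiredTiger's. */
struct sketch_gate_page {
    bool prune_ts_ok;    /* outcome of the prune timestamp eligibility check */
    int exclusive_locks; /* number of times the exclusive lock was taken */
};

/* Cheap path: inspects page state only and takes no lock. */
static bool
sketch_force_check(const struct sketch_gate_page *page)
{
    return (page->prune_ts_ok);
}

/* Expensive path: takes the exclusive lock, then reviews the page. */
static bool
sketch_evict(struct sketch_gate_page *page)
{
    page->exclusive_locks++;    /* stands in for acquiring the exclusive lock */
    return (page->prune_ts_ok); /* the review can still fail after locking */
}

/* Enter the expensive path only when the cheap check passes. */
static bool
sketch_try_evict(struct sketch_gate_page *page)
{
    if (!sketch_force_check(page))
        return (false);
    return (sketch_evict(page));
}
```

In this model, a page that fails the timestamp check is rejected before any lock is taken, so no other thread would be blocked on it or pushed into a retry.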

      Additionally, I added a guard around the prune timestamp check in __evict_review() so that it only takes effect when the page's memory footprint is below the max memory page threshold. This ensures that oversized pages are not blocked from eviction by the timestamp check.
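The guard can be pictured as follows. This is a sketch under stated assumptions, not the actual patch: sketch_evict_allowed() and its parameters are hypothetical, and the outcome of the prune timestamp check is passed in as a boolean because this ticket does not spell out the check's internals.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: decide whether the review step lets eviction proceed.
 *   page_bytes      - the page's current in-memory footprint
 *   memory_page_max - the configured maximum in-memory page size
 *   prune_ts_ok     - result of the prune timestamp eligibility check
 */
static bool
sketch_evict_allowed(size_t page_bytes, size_t memory_page_max, bool prune_ts_ok)
{
    /* Pages at or above the threshold skip the timestamp check entirely. */
    if (page_bytes >= memory_page_max)
        return (true);

    /* Pages under the threshold remain subject to the timestamp check. */
    return (prune_ts_ok);
}
```

With a 10MB threshold, a 20MB page is allowed through even when the timestamp check fails, while a 1MB page is still held back by it.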

      Experiments with the page-size guard alone (Option 3 below) have shown a ~68% reduction in standby lag, from 330s to 105s.

      Note: This is an alternative to WT-17042. Both approaches yield a similar reduction in standby lag. However, with WT-17042's approach, cache health can become a concern: in Linkbench on a 32GB machine, pages could grow up to ~1.6GB (since the dirty eviction threshold is 5% of 32GB). In contrast, this approach achieves a similar lag reduction while maintaining a healthy cache, with page sizes remaining within the configured maximum memory page threshold of 10MB.

      Results (DSI linkbench workload — standby lag)
      Baseline: 330s
      https://spruce.corp.mongodb.com/version/69dfd9ff6fbb98000786b7aa/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC

      Option 1: Prune ts check in __evict_force_check() + existing prune ts check in __evict_review() → 228s (31% better)
      https://spruce.corp.mongodb.com/version/69dfdd684025820007281d45/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC
      Option 2: Prune ts check in __evict_force_check() + page-size-guarded prune ts check in __evict_review() → 96s (71% better)
      https://spruce.corp.mongodb.com/version/69dfdd090881960007439ad1/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC
      Option 3: Only page-size-guarded prune ts check in __evict_review() → 105s (68% better)
      https://spruce.corp.mongodb.com/version/69dfde58a991cc0007a17528/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC
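For reference, the percentages above follow from the lag figures relative to the 330s baseline. The helper below is a hypothetical convenience for reproducing that arithmetic and is not part of the change.

```c
/* Hypothetical helper: percentage reduction of measured lag versus baseline. */
static double
sketch_lag_reduction_pct(double baseline_s, double measured_s)
{
    return ((baseline_s - measured_s) / baseline_s * 100.0);
}
```

Applied to the results, 228s, 96s, and 105s against 330s give roughly 30.9%, 70.9%, and 68.2%, which round to the 31%, 71%, and 68% quoted for Options 1 to 3.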

            Assignee:
            Suganthi Mani
            Reporter:
            Suganthi Mani