Prefer evicting update pages over clean pages under combined cache pressure

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor - P4
    • Affects Version/s: None
    • Component/s: None
    • Storage Engines - Transactions
    • 185.204

      Issue Summary

      When eviction has both clean (WT_EVICT_CACHE_CLEAN) and updates (WT_EVICT_CACHE_UPDATES) targets active, this proposes preferring pages that carry updates over clean pages: reconciling an update page frees cache bytes and reduces tracked update bytes at once, while evicting a clean page only frees bytes. The hypothesis is that this relieves update-byte pressure faster and heads off application-thread update eviction.
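      The accounting argument behind the hypothesis can be sketched with a toy model. The struct and function names below are hypothetical and do not correspond to WiredTiger's real cache structures; the model also deliberately ignores the cost of reconciliation, which is exactly the factor the hypothesis leaves out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified cache accounting; not WiredTiger's real structures. */
struct toy_cache {
    uint64_t bytes_inuse;   /* all bytes currently held in cache */
    uint64_t bytes_updates; /* subset of bytes_inuse attributed to updates */
};

struct toy_page {
    uint64_t bytes_total;   /* page footprint in cache */
    uint64_t bytes_updates; /* 0 for a clean page */
};

/* Remove a page's footprint from both counters. */
static void
toy_evict(struct toy_cache *cache, const struct toy_page *page)
{
    /* Sanity checks: a page cannot hold more than the cache tracks. */
    assert(page->bytes_updates <= page->bytes_total);
    assert(page->bytes_total <= cache->bytes_inuse);
    assert(page->bytes_updates <= cache->bytes_updates);

    cache->bytes_inuse -= page->bytes_total;
    cache->bytes_updates -= page->bytes_updates;
}
```

      In this model, evicting a clean page moves only bytes_inuse, while evicting an update page of the same size moves both counters, which is the "two pressures at once" effect the proposal relies on.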

      Outcome: investigated and measured net-negative as a general policy. Recording here for reference so the approach is not re-tried blindly.

      Context

      • Related to WT-15538 (slow eviction when updates ratio is high). That umbrella describes large caches stuck above eviction_updates_trigger with update content the walker struggles to evict.
      • Trigger condition (both CLEAN and UPDATES set) is effectively the steady state under cache pressure, so a naive "prefer updates" rule behaves like a blanket "stop evicting clean pages" policy.

      Change Implemented

      • In __evict_try_queue_page() (src/evict/evict_walk.c), after a page passes the should_evict_page check, defer a page that qualifies only as a clean target so the queue slots and the per-tree walk budget go to update-carrying pages:
        if (evict_clean && !evict_updates && F_ISSET(evict, WT_EVICT_CACHE_UPDATES) &&
          !F_ISSET(evict, WT_EVICT_CACHE_CLEAN_HARD)) {
            WT_STAT_CONN_INCR(session, eviction_server_skip_clean_pages_prefer_updates);
            return;
        }
        
      • Safety valve: the WT_EVICT_CACHE_CLEAN_HARD guard re-enables clean-page queuing once fill is critical, bounding total cache size if update pages are scarce or cannot be evicted.
      • Added stat eviction_server_skip_clean_pages_prefer_updates to count deferrals.
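      The deferral rule and its safety valve can be isolated as a standalone predicate for reasoning about which flag combinations defer a page. This is a sketch under stated assumptions: the TOY_* flag values are invented stand-ins for the WT_EVICT_CACHE_* flags, evict_clean and evict_updates are taken to be per-page booleans as in the snippet above (how they are computed is not shown in this ticket), and toy_count_deferrals only imitates what the new stat would count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for WT_EVICT_CACHE_*; values are illustrative. */
#define TOY_EVICT_CLEAN 0x1u
#define TOY_EVICT_CLEAN_HARD 0x2u
#define TOY_EVICT_UPDATES 0x4u

/* Per-page qualifiers, assumed to mirror evict_clean / evict_updates. */
struct toy_page_info {
    bool evict_clean;
    bool evict_updates;
};

/* True when a page that qualifies only as a clean target would be deferred. */
static bool
toy_defer_clean_page(uint32_t evict_flags, bool evict_clean, bool evict_updates)
{
    return (evict_clean && !evict_updates && (evict_flags & TOY_EVICT_UPDATES) != 0 &&
      (evict_flags & TOY_EVICT_CLEAN_HARD) == 0);
}

/* Count deferred candidates, imitating what the skip stat accumulates. */
static size_t
toy_count_deferrals(uint32_t evict_flags, const struct toy_page_info *pages, size_t n)
{
    size_t deferred = 0;

    assert(pages != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (toy_defer_clean_page(evict_flags, pages[i].evict_clean, pages[i].evict_updates))
            deferred++;
    return (deferred);
}
```

      With CLEAN and UPDATES both set, every clean-only candidate is deferred until CLEAN_HARD is raised, which is why the rule behaves like a blanket policy whenever combined pressure is the steady state.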

      Experiment & Results

      Mechanism confirmed active (FTDC, linkbench2)

      • Skip stat accumulated ~72M deferrals over the test phase, so combined CLEAN+UPDATES pressure is constant for this workload.
      • The "pages selected for eviction unable to be evicted" stat stayed at ~7/s, so deferring clean pages did not cause failed-eviction thrash.
      • Cache fill held steady (~16.9 GB), i.e. the CLEAN_HARD valve held.

      perf-required (perf-3-node-replSet.arm.aws.2024-05, 26 workloads)

      Consistent pattern across the suite: reads improve, writes/loads regress, often severely.

      Significant regressions (write/load-heavy):

      Workload                          Measurement                Delta      z
      ycsb.out_of_cache.95read5update   load ops/sec               -27.3%     -11.5
      ycsb.load_0128thread              load ops/sec               -31.9%     -10.7
      ycsb-update_force_stepdowns       load ops/sec               -25.8%     -9.3
      ycsb.out_of_cache.100read         load ops/sec               -26.7%     -9.0
      mixed_workloads                   Update/Delete/Insert p50   +10..11%   5.2..5.7
      tpce_locust                       trade_update avg latency   +48.7%     4.6
      ycsb.in_cache.95read5update       update latency             +47.5%     2.8

      Significant improvements (read-heavy):

      Workload                      Measurement          Delta     z
      ycsb.in_cache.95read5update   read latency         -20.8%    -4.5
      linkbench2                    reads (10 metrics)   -5..-8%   up to -2.3
      mixed_workloads               FindOne p50          -11%      -2.8
      ecommerce_locust              7 metrics improved

      Per-task tally of significant changes: linkbench2 10 improved / 0 regressed and ecommerce 7 improved, but mixed_workloads 1 improved / 14 regressed, the ycsb load family heavily regressed, and tpce/tsbs regressed.

      Analysis / Verdict

      • The mechanism does exactly what it was designed to: retaining clean pages improves read latency, but refusing to evict them forces eviction to keep reconciling still-hot update/dirty pages, which wrecks insert/update throughput.
      • The ycsb load regressions (-26% to -32% ops/sec, |z| between 9.0 and 11.5) are showstoppers, not noise.
      • Even in update-pressured workloads it targets (ycsb 95read5update), it made update latency worse (+47%) while improving reads, so it does not relieve the WT-15538 update-stuck problem; it shifts cost onto the write path.

      Recommendation: do not pursue as-is. If revisited for WT-15538, the lever must be gated much more narrowly (only when the cache is genuinely stuck above eviction_updates_trigger with application threads spinning), not on the common combined-pressure case.
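      One way to picture the narrower gate is as a predicate over a snapshot of cache state. This is a minimal sketch, not a proposed patch: the struct, its field names, and the stalled-thread counter are all hypothetical, and updates_trigger_pct merely stands in for eviction_updates_trigger expressed as a percentage. It also checks only the instantaneous state; "genuinely stuck" would additionally need a notion of how long the condition has held, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of cache state; field names are illustrative only. */
struct toy_pressure {
    uint64_t cache_size;          /* configured cache size in bytes */
    uint64_t bytes_updates;       /* bytes currently attributed to updates */
    uint32_t updates_trigger_pct; /* stand-in for eviction_updates_trigger, in percent */
    uint32_t app_threads_stalled; /* application threads currently waiting on eviction */
    bool clean_hard;              /* stand-in for WT_EVICT_CACHE_CLEAN_HARD */
};

/*
 * Prefer update pages only when update bytes exceed the trigger AND
 * application threads are already stalled, and never once clean-hard
 * pressure is set.
 */
static bool
toy_prefer_updates_narrow(const struct toy_pressure *p)
{
    uint64_t trigger_bytes;

    assert(p->updates_trigger_pct <= 100);

    /* Divide before multiplying so the product cannot overflow. */
    trigger_bytes = p->cache_size / 100 * p->updates_trigger_pct;

    return (p->bytes_updates > trigger_bytes && p->app_threads_stalled > 0 && !p->clean_hard);
}
```

      Under such a gate the common combined-pressure case, where update bytes sit below the trigger or no application thread is waiting, would leave clean-page eviction untouched.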

      Reproduction Artifacts

      • WiredTiger branch: wt-15538-prefer-update-pages off develop (commit a75669ed92).
      • mongo branch: wt-15538-prefer-update-pages off master (commit 82da7ae22d0); change applied in place to vendored WT and stats regenerated with dist/stat.py.
      • Targeted linkbench2 patch: 6a1a4d6fb5175000070bb9b9 ; comparison 6a1a80125df74c74026a7552.
      • perf-required patch: 6a1a8184263cb60007606303 ; comparison 6a1a92a56838c5d8f0224a95.

      Definition of Done

      • This experiment is recorded with data (done). Any future revisit should start from the narrowly-gated direction above rather than the blanket policy.

            Assignee:
            [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            Haribabu Kommi
            Votes:
            0
            Watchers:
            2
