Batch per-update cache atomics in __wti_page_inmem_updates

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • WT12.0.0
    • Affects Version/s: None
    • Component/s: Cache and Eviction
    • None
    • Storage Engines - Transactions
    • 120.232
    • None
    • None
    • v8.3, v8.0, v7.0

      Summary

      __wt_cache_page_inmem_incr uses significant CPU in workloads with high page-in/restoration rates. The function performs 9-12 relaxed atomic operations on shared cache-line counters per WT_UPDATE, and is called once per tombstone or prepared-update instantiated during page-in. Batching these atomics into a single bulk call recovers this CPU.

      Motivation

      __wti_page_inmem_updates instantiates in-memory WT_UPDATE structures for every tombstone and prepared update on a page when it is read into cache. Each instantiation calls __wt_row_modify -> __wt_update_serial -> __wt_cache_page_inmem_incr, which performs 9-12 __wt_atomic_add_uint64_relaxed operations on shared WT_CACHE and WT_BTREE counters per update. On workloads with high page turnover (many evictions and re-reads), this creates extreme call volume:

      • A page with 100 tombstones triggers 900–1,200 relaxed atomics on shared cache lines (100 updates × 9-12 atomics each)
      • The atomics cause cache-line bouncing between cores on saturated systems
      • The page is exclusively owned during __wti_page_inmem_updates, making per-update atomics unnecessary — no other thread can observe the intermediate counter values
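      The cost difference described above can be illustrated with a small standalone C11 sketch. It is not WiredTiger code: toy_cache, TOY_NCOUNTERS, and the toy_account_* functions are hypothetical names, and three counters stand in for the 9-12 real ones. The sketch only shows that both approaches leave the counters with the same totals while the number of atomic operations differs by a factor of N.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared cache and btree byte counters. */
enum { TOY_NCOUNTERS = 3 };

typedef struct {
    _Atomic uint64_t bytes[TOY_NCOUNTERS]; /* shared between threads in real code */
    uint64_t atomic_ops;                   /* demo bookkeeping only, not shared */
} toy_cache;

/* Add size to every shared counter using relaxed ordering. */
static void
toy_cache_incr(toy_cache *cache, uint64_t size)
{
    for (int i = 0; i < TOY_NCOUNTERS; i++) {
        atomic_fetch_add_explicit(&cache->bytes[i], size, memory_order_relaxed);
        cache->atomic_ops++;
    }
}

/* Per-update accounting: one round of atomics for each update. */
static void
toy_account_per_update(toy_cache *cache, const uint64_t *sizes, size_t n)
{
    for (size_t i = 0; i < n; i++)
        toy_cache_incr(cache, sizes[i]);
}

/* Batched accounting: sum into a local, then one round of atomics. */
static void
toy_account_batched(toy_cache *cache, const uint64_t *sizes, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += sizes[i];
    toy_cache_incr(cache, total);
}
```

      With 100 updates, the per-update path in this sketch performs 300 atomic adds and the batched path performs 3. Batching is only safe under the ownership condition stated above, where no other thread can observe the page while the local sum is being built.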

      Suggested Solution

      Add a session flag (WT_SESSION_SKIP_CACHE_INCR) that suppresses __wt_cache_page_inmem_incr inside the serialization functions. Set the flag at the start of __wti_page_inmem_updates, clear it at the end, and perform a single bulk __wt_cache_page_inmem_incr(session, page, total_size, false) call after the loop. For a page with N instantiated updates, this reduces N×12 atomics to 1×6.

      Three serialization functions must respect the flag: __wt_update_serial, __wt_insert_serial, and __wt_col_append_serial. The column-store insert/append paths are reached from __wti_page_inmem_updates for prepared records, so all three must be gated to avoid double-counting.
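      The gating pattern can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the actual patch: toy_session, toy_page, TOY_SESSION_SKIP_CACHE_INCR, and every toy_* function are hypothetical stand-ins, the shared counters are reduced to one plain integer, and only two of the three serialization functions are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SESSION_SKIP_CACHE_INCR 0x1u /* stand-in for the proposed session flag */

typedef struct {
    uint32_t flags;
} toy_session;

typedef struct {
    uint64_t bytes_inmem; /* stand-in for the shared cache/btree counters */
    uint64_t incr_calls;  /* how many times the accounting function ran */
} toy_page;

/* Stand-in for the accounting call that touches the shared counters. */
static void
toy_cache_page_inmem_incr(toy_page *page, uint64_t size)
{
    page->bytes_inmem += size;
    page->incr_calls++;
}

/* Shared gate: every serialization function accounts through here. */
static void
toy_serial_account(toy_session *session, toy_page *page, uint64_t size)
{
    if (!(session->flags & TOY_SESSION_SKIP_CACHE_INCR))
        toy_cache_page_inmem_incr(page, size);
}

/* Linking the update or insert into the page is omitted in both functions. */
static void
toy_update_serial(toy_session *session, toy_page *page, uint64_t size)
{
    toy_serial_account(session, page, size);
}

static void
toy_insert_serial(toy_session *session, toy_page *page, uint64_t size)
{
    toy_serial_account(session, page, size);
}

/* Page-in loop: suppress per-update accounting, then account once in bulk. */
static void
toy_page_inmem_updates(toy_session *session, toy_page *page, const uint64_t *sizes, size_t n)
{
    uint64_t total_size = 0;

    /* The flag is expected to be clear on entry; nesting is not modeled. */
    assert(!(session->flags & TOY_SESSION_SKIP_CACHE_INCR));

    session->flags |= TOY_SESSION_SKIP_CACHE_INCR;
    for (size_t i = 0; i < n; i++) {
        /* Alternate paths to mimic row-store updates and column-store inserts. */
        if (i % 2 == 0)
            toy_update_serial(session, page, sizes[i]);
        else
            toy_insert_serial(session, page, sizes[i]);
        total_size += sizes[i];
    }
    session->flags &= ~TOY_SESSION_SKIP_CACHE_INCR;

    if (total_size != 0)
        toy_cache_page_inmem_incr(page, total_size);
}
```

      If any one serialization path skipped the gate, its sizes would be counted once inside the loop and again in the bulk call, which is the double-counting the ticket warns about. Outside the page-in loop the flag is clear, so ordinary callers of the serialization functions are accounted per update as before.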

      The error path must also perform the bulk increment for the total_size accumulated so far, to avoid leaving already-linked updates unaccounted.
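      A minimal sketch of the error-path requirement, using a single exit label so the bulk increment runs on both success and failure. The names (toy_page, toy_link_update, toy_page_inmem_updates_err) are hypothetical, and the failure condition (a zero-sized update) exists only to make the example testable.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t bytes_inmem; /* stand-in for the shared cache/btree counters */
} toy_page;

/* Hypothetical link step: fails on a zero-sized update, for demonstration only. */
static int
toy_link_update(uint64_t size)
{
    return (size == 0 ? -1 : 0);
}

/* Instantiate updates; account for everything linked so far, even on error. */
static int
toy_page_inmem_updates_err(toy_page *page, const uint64_t *sizes, size_t n)
{
    uint64_t total_size = 0;
    int ret = 0;

    for (size_t i = 0; i < n; i++) {
        if ((ret = toy_link_update(sizes[i])) != 0)
            goto done;
        /* Only count an update after it has been linked successfully. */
        total_size += sizes[i];
    }

done:
    /* Reached on success and on failure: already-linked updates stay accounted. */
    if (total_size != 0)
        page->bytes_inmem += total_size;
    return (ret);
}
```

      In this sketch a failure on the third of four updates still accounts for the first two. A real implementation would presumably clear the session flag in the same cleanup block, although the ticket does not say so explicitly.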

      Relevant Links

      • Related: WT-14340 Make conn->flags atomic — similar pattern of shared-counter atomics as CPU bottleneck

            Assignee:
            Daniel Hill
            Reporter:
            Daniel Hill