- Type: Task
- Resolution: Won't Do
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Block Cache
- None
- Storage Engines - Persistence
- 277.692
- SE Persistence backlog
- None
Issue Summary
The victim cache (disaggregated storage block cache) stores compressed page images on eviction via __wt_blkcache_compress, but that compression work is currently untimed. We have no visibility into how long victim-cache compression takes, which is needed to evaluate the "acceptable overhead" leg of the disagg victim-cache decision rule. This ticket adds a connection-level latency histogram, perf_hist_disaggcompress_latency_*, measured at the eviction call site.
Context
- On eviction, pages destined for the disagg victim cache are compressed at src/evict/evict_page.c:141 via __wt_blkcache_compress (defined in src/block_cache/block_io.c:664). The call is wrapped in WT_IGNORE_RET.
- The compress path already has DSRC counters (compress_write, compress_write_fail, compress_write_too_small, and a compression-ratio histogram), but no timing data.
- __wt_blkcache_compress is shared with the normal block-manager write path (src/block_cache/block_io.c:795), so timing must be done at the eviction caller to isolate the victim-cache cost rather than inside the shared function.
- Existing model to mirror: the disagg block-manager read/write latency histograms (perf_hist_disaggbmread_latency_* / perf_hist_disaggbmwrite_latency_*), which are disagg-scoped and bucketed in microseconds, the right scale for per-block compression.
Proposed Solution
Add a microsecond latency histogram via the existing WT_STAT_USECS_HIST_INCR_FUNC machinery. One generated helper updates a _total_usecs accumulator plus 8 distribution buckets, giving both the mean (total_usecs / count) and the tail.
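As a rough illustration of what such a helper does, here is a minimal standalone sketch of an 8-bucket microsecond histogram with a running total. The struct, function names, and bucket table are hypothetical stand-ins, not the generated WiredTiger code. In particular, this sketch counts every sample in both the total and a bucket, which may differ from what the real macro does for very small values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the generated connection statistics fields. */
#define HIST_BUCKETS 8

typedef struct {
    uint64_t bucket[HIST_BUCKETS]; /* lt100, lt250, ..., lt10000, gt10000 */
    uint64_t total_usecs;          /* Sum of all recorded latencies. */
} usecs_hist;

/* Exclusive upper bounds of the first seven buckets; the eighth is open-ended. */
static const uint64_t hist_upper[HIST_BUCKETS - 1] = {100, 250, 500, 1000, 2500, 5000, 10000};

/* Record one latency sample: add it to the total and bump exactly one bucket. */
static void
usecs_hist_incr(usecs_hist *h, uint64_t usecs)
{
    size_t i;

    assert(h != NULL);
    h->total_usecs += usecs;
    for (i = 0; i < HIST_BUCKETS - 1; ++i)
        if (usecs < hist_upper[i])
            break;
    /* If no bound matched, i == HIST_BUCKETS - 1, the open-ended bucket. */
    ++h->bucket[i];
}

/* Number of samples recorded, recovered by summing the buckets. */
static uint64_t
usecs_hist_count(const usecs_hist *h)
{
    uint64_t count = 0;
    size_t i;

    for (i = 0; i < HIST_BUCKETS; ++i)
        count += h->bucket[i];
    return (count);
}

/* Mean latency in microseconds, or 0 when nothing has been recorded. */
static double
usecs_hist_mean(const usecs_hist *h)
{
    uint64_t count = usecs_hist_count(h);

    return (count == 0 ? 0.0 : (double)h->total_usecs / (double)count);
}
```

The mean is derived from the total and the bucket sum, so no separate sample counter is needed. The tail is read directly from the upper buckets.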
1. dist/stat_data.py — declare the histogram family
Add 9 PerfHistStat entries in the conn_stats list, alphabetically right after the perf_hist_disaggbmwrite_latency_* block. Bucket suffixes must match the macro at src/include/stat.h:356 exactly:
    PerfHistStat('perf_hist_disaggcompress_latency_lt100', 'disagg victim cache compress latency histogram (bucket 1) - 50-99us'),
    PerfHistStat('perf_hist_disaggcompress_latency_lt250', 'disagg victim cache compress latency histogram (bucket 2) - 100-249us'),
    PerfHistStat('perf_hist_disaggcompress_latency_lt500', 'disagg victim cache compress latency histogram (bucket 3) - 250-499us'),
    PerfHistStat('perf_hist_disaggcompress_latency_lt1000', 'disagg victim cache compress latency histogram (bucket 4) - 500-999us'),
    PerfHistStat('perf_hist_disaggcompress_latency_lt2500', 'disagg victim cache compress latency histogram (bucket 5) - 1000-2499us'),
    PerfHistStat('perf_hist_disaggcompress_latency_lt5000', 'disagg victim cache compress latency histogram (bucket 6) - 2500-4999us'),
    PerfHistStat('perf_hist_disaggcompress_latency_lt10000', 'disagg victim cache compress latency histogram (bucket 7) - 5000-9999us'),
    PerfHistStat('perf_hist_disaggcompress_latency_gt10000', 'disagg victim cache compress latency histogram (bucket 8) - 10000us+'),
    PerfHistStat('perf_hist_disaggcompress_latency_total_usecs', 'disagg victim cache compress latency histogram total (usecs)'),
2. src/include/block_inline.h — instantiate the helper
Add next to the existing disagg instantiations (currently lines 20-21):
    WT_STAT_USECS_HIST_INCR_FUNC(disaggcompress, perf_hist_disaggcompress_latency)
This generates __wt_stat_usecs_hist_incr_disaggcompress(session, usecs), visible in evict_page.c.
3. src/evict/evict_page.c — bracket the call
Gate the clock with the standard pattern from the write path (src/block_cache/block_io.c:841) so __wt_clock is not paid when stats are disabled:
    /* Optionally compress the data before caching. */
    timer = WT_STAT_ENABLED(session) && !F_ISSET(session, WT_SESSION_INTERNAL);
    time_start = timer ? __wt_clock(session) : 0;
    WT_IGNORE_RET(
      __wt_blkcache_compress(session, &buf_orig, false, &compressed_buf, NULL, &compressed));
    if (timer)
        __wt_stat_usecs_hist_incr_disaggcompress(
          session, WT_CLOCKDIFF_US(__wt_clock(session), time_start));
    if (compressed_buf != NULL)
        cache_buf = compressed_buf;
Also declare uint64_t time_start; and bool timer; in the function's declaration block.
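The gating pattern in step 3 can be illustrated outside WiredTiger with the standalone sketch below. Everything in it is a hypothetical stand-in: now_ns() replaces __wt_clock, fake_compress() replaces __wt_blkcache_compress, and the two globals replace the connection statistics. It assumes a POSIX system that provides clock_gettime with CLOCK_MONOTONIC.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-ins for the connection-level statistics. */
static uint64_t samples_recorded;
static uint64_t total_usecs;

/* Monotonic clock in nanoseconds; stands in for __wt_clock. */
static uint64_t
now_ns(void)
{
    struct timespec ts;
    int ret;

    ret = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(ret == 0);
    (void)ret;
    return ((uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec);
}

/* Hypothetical stand-in for the compression call; always succeeds. */
static int
fake_compress(void)
{
    return (0);
}

/* Compress one block, reading the clock only when the sample will be recorded. */
static void
cache_block(bool stats_enabled, bool internal_session)
{
    uint64_t time_start;
    bool timer;

    timer = stats_enabled && !internal_session;
    time_start = timer ? now_ns() : 0;

    /* Best effort: the return value is deliberately ignored, as with WT_IGNORE_RET. */
    (void)fake_compress();

    if (timer) {
        total_usecs += (now_ns() - time_start) / 1000;
        ++samples_recorded;
    }
}
```

With the gate written this way, a call made with statistics disabled or on an internal session never reads the clock and records nothing, so only the remaining calls contribute samples.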
4. dist/s_all — regenerate
Run cd src/third_party/wiredtiger/dist && ./s_all to regenerate the derived files from stat_data.py (src/include/stat.h struct fields, src/support/stat.c clear/aggregate/init). These regenerated files are committed but must never be hand-edited.
Definition of Done
- cd dist && ./s_all exits clean (stat ordering + codegen validated).
- The build compiles.
- A disagg workload with the victim cache enabled shows perf_hist_disaggcompress_latency_total_usecs advancing alongside compress_write in the statistics output.
Notes
- These are connection-level stats (the macro uses WT_STAT_CONN_INCR_*), consistent with the other perf_hist_* families and distinct from the existing DSRC compress_write* counters.
- Measuring at the caller captures total per-block cost (pre_size + alloc + codec + memcpy), which is what the overhead-acceptability decision wants. If pure codec time across all callers is preferred instead, the alternative is bracketing src/block_cache/block_io.c:721.
- This touches the eviction/storage hot path, so it is worth running a sys-perf patch before merge.