Type: Task
Resolution: Fixed
Priority: Major - P3
Fix Version/s: WT12.0.0, 9.0.0-rc0
Affects Version/s: None
Component/s: Block Cache
Labels:
None

Assigned Teams:

Storage Engines, Storage Engines - Persistence
Total Hours with Assigned Team:
754.532
Epic Link:
Use PALI block cache in production
Sprint:
SE Persistence backlog
Story Points:
None

Issue Summary

Add a block_cache_put_time_max connection statistic that records the maximum time, in microseconds, spent adding a single page to the disaggregated victim cache. This complements the cumulative block_cache_put_time and the per-thread-role put counters added under WT-17850: the cumulative time gives the average per-put cost, but only a maximum surfaces the worst-case latency an operation pays on the eviction/victim-cache critical path. This is the tail signal needed to decide whether application threads should be allowed to contribute to the victim cache.
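To illustrate why the average alone is not enough, here is a minimal standalone sketch (not WiredTiger code). The put_summary struct, the summarize helper, and the sample values are all hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of a batch of put timings, in microseconds. */
struct put_summary {
    uint64_t total_us; /* plays the role of the cumulative block_cache_put_time */
    uint64_t count;    /* number of puts observed */
    uint64_t max_us;   /* plays the role of the proposed block_cache_put_time_max */
};

/* Fold a set of per-put timings into a total, a count and a maximum. */
static struct put_summary
summarize(const uint64_t *samples, size_t n)
{
    struct put_summary s = {0, 0, 0};
    size_t i;

    assert(samples != NULL || n == 0);
    for (i = 0; i < n; i++) {
        s.total_us += samples[i];
        s.count++;
        if (samples[i] > s.max_us)
            s.max_us = samples[i];
    }
    return (s);
}
```

With nine puts of 10 usecs and one put of 5000 usecs, total_us / count is 509 usecs, which describes none of the ten puts, while max_us reports the 5000 usecs stall directly.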

This was prototyped in the WT-17850 PR but pulled out, because how we want the maximum to behave over time still needs to be defined (see Open question below).

Context

The victim-cache put happens in __evict_page_victim_cache (src/evict/evict_page.c), where we already measure the cumulative put time. The maximum can follow the existing eviction maximum-latency pattern, e.g. evict_max_ms / eviction_maximum_milliseconds:

A running maximum is held in a wt_shared uint64_t field on WT_EVICT (src/evict/evict.h), e.g. evict_victim_cache_max_put_us.
It is updated at the put site with __wt_atomic_stats_max_uint64(&conn->evict->evict_victim_cache_max_put_us, elapsed), where elapsed is the WT_CLOCKDIFF_US already computed for block_cache_put_time.
The field is copied into the statistic in __wt_evict_stats_init (src/evict/evict_conn.c) via WT_STATP_CONN_SET(session, stats, block_cache_put_time_max, ...). That function runs on every connection-stats read (called from __wt_conn_stat_init), so the statistic stays live - this is exactly how eviction_maximum_milliseconds is wired.
The stat is declared in dist/stat_data.py as a BlockCacheStat and the derived code regenerated with dist/stat.py.
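The wiring above can be sketched outside the tree as follows. This is a standalone approximation using C11 atomics, not the WiredTiger implementation: struct evict_sketch, stats_max_u64, record_put and stats_init_sketch are hypothetical stand-ins for WT_EVICT, __wt_atomic_stats_max_uint64, the put site and __wt_evict_stats_init respectively.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the running-maximum field on WT_EVICT. */
struct evict_sketch {
    _Atomic uint64_t victim_cache_max_put_us;
};

/*
 * Lock-free "store if greater". Assumed to behave like the atomic max helper
 * named in the ticket; written with C11 atomics rather than WiredTiger macros.
 */
static void
stats_max_u64(_Atomic uint64_t *maxp, uint64_t v)
{
    uint64_t cur;

    assert(maxp != NULL);
    cur = atomic_load_explicit(maxp, memory_order_relaxed);

    /* A failed CAS reloads cur, so we retry only while v is still larger. */
    while (v > cur &&
      !atomic_compare_exchange_weak_explicit(
        maxp, &cur, v, memory_order_relaxed, memory_order_relaxed))
        ;
}

/* Put site: reuse the elapsed time already measured for the cumulative stat. */
static void
record_put(struct evict_sketch *evict, uint64_t *cumulative_us, uint64_t elapsed_us)
{
    *cumulative_us += elapsed_us; /* non-atomic here; the sketch is single-threaded */
    stats_max_u64(&evict->victim_cache_max_put_us, elapsed_us);
}

/* Stats-read side: copy the live field into the value that gets reported. */
static uint64_t
stats_init_sketch(struct evict_sketch *evict)
{
    return (atomic_load_explicit(&evict->victim_cache_max_put_us, memory_order_relaxed));
}
```

Relaxed ordering is used on the assumption that the statistic is advisory and does not need to order any other memory accesses.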

Open question - how the maximum behaves over time

A plain lifetime maximum (no_clear, like evict_max_ms) only ever ratchets up, which is of limited use in FTDC. We likely want a maximum over a collection period. Two patterns exist in the codebase:

Per-checkpoint reset - mirror evict_max_ms_per_checkpoint, which is reset to 0 at a period boundary (see src/checkpoint/checkpoint_txn.c). Cheap, but tied to checkpoint cadence rather than the FTDC sampling interval.
Clear-on-read - make the statistic clearable and zero the backing WT_EVICT field when stats are read with statistics=(clear). This matches the FTDC sampling interval but needs extra wiring, since the generated clear path only zeroes the stat array, not the WT_EVICT field.
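The two options can be compared in a standalone sketch. It uses C11 atomics and hypothetical names (max_put_us, observe_put, period_boundary_reset, read_max); it does not show the actual WiredTiger clear path or checkpoint code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical backing field; static storage starts at zero. */
static _Atomic uint64_t max_put_us;

/* Put site: ratchet the running maximum upwards. */
static void
observe_put(uint64_t elapsed_us)
{
    uint64_t cur = atomic_load(&max_put_us);

    while (elapsed_us > cur && !atomic_compare_exchange_weak(&max_put_us, &cur, elapsed_us))
        ;
}

/* Option 1: reset at a period boundary, e.g. when a checkpoint starts. */
static void
period_boundary_reset(void)
{
    atomic_store(&max_put_us, 0);
}

/*
 * Option 2: clear-on-read. The exchange returns the maximum for the period
 * and zeroes the backing field in one step, so an update that lands between
 * the read and the clear cannot be lost.
 */
static uint64_t
read_max(bool clear)
{
    if (clear)
        return (atomic_exchange(&max_put_us, 0));
    return (atomic_load(&max_put_us));
}
```

With option 1 every reader sees the same value until the next boundary. With option 2 each clearing reader consumes the maximum, which is a problem if more than one consumer reads with clear set. That may be worth weighing when deciding the semantics.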

Decide which semantics we want before implementing.

Proposed Solution

Add the evict_victim_cache_max_put_us field to WT_EVICT and update it at the put site in __evict_page_victim_cache.
Refresh block_cache_put_time_max from it in __wt_evict_stats_init.
Implement the chosen reset semantics (per-collection-period vs lifetime) from the open question above.

Definition of Done

block_cache_put_time_max is defined, populated, and validated by dist/s_all stat checks.
The reset behaviour is decided and implemented.
A short note on the chosen semantics is recorded on this ticket.

is related to

WT-17850 Investigate whether application threads should contribute to the disaggregated victim cache during eviction

Closed

split from

WT-17801 Identify statistics to track block cache contention

Closed

Assignee: Etienne Petrel
Reporter: Etienne Petrel
Votes: 0
Watchers: 1

Created: Jun 17 2026 03:13:28 AM UTC
Updated: Jun 21 2026 03:15:08 AM UTC
Resolved: Jun 18 2026 01:46:44 AM UTC
