- Type: Task
- Resolution: Fixed
- Priority: Minor - P4
- Affects Version/s: None
- Component/s: Prefetch
- None
- Storage Engines - Persistence
- 258.309
- SE Persistence - 2026-05-08
- None
WiredTiger has a number of prefetch statistics that are redundant and make diagnostics harder to interpret. Specifically, prefetch_skipped ("pre-fetch not triggered by page read") is incremented alongside a more specific stat at every call site, so the aggregate counter carries no additional information.
Context
In src/session/session_prefetch.c, every early-return path increments both a reason-specific stat and the generic prefetch_skipped:
    if (F_ISSET(session, WT_SESSION_INTERNAL)) {
        WT_STAT_CONN_INCR(session, prefetch_skipped_internal_session);
        WT_STAT_CONN_INCR(session, prefetch_skipped);
        return (false);
    }
    if (F_ISSET(ref, WT_REF_FLAG_INTERNAL)) {
        WT_STAT_CONN_INCR(session, prefetch_skipped_internal_page);
        WT_STAT_CONN_INCR(session, prefetch_skipped);
        return (false);
    }
    if (F_ISSET(S2BT(session), WT_BTREE_SPECIAL_FLAGS) &&
      !F_ISSET(S2BT(session), WT_BTREE_VERIFY)) {
        WT_STAT_CONN_INCR(session, prefetch_skipped_special_handle);
        WT_STAT_CONN_INCR(session, prefetch_skipped);
        return (false);
    }
    if (session->pf.prefetch_disk_read_count < 2) {
        WT_STAT_CONN_INCR(session, prefetch_skipped_disk_read_count);
        WT_STAT_CONN_INCR(session, prefetch_skipped);
        return (false);
    }
Because prefetch_skipped always equals the sum of the per-reason skip stats, it provides no diagnostic value beyond what summing the specific counters would give. This pattern likely extends to other prefetch aggregate stats.
- Redundant stats make dashboards and diagnostic scripts harder to interpret.
- Operators and developers must cross-reference multiple counters to understand prefetch skip rates by cause.
- Removing or restructuring the duplicate counter would simplify analysis without loss of information.
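To illustrate why the aggregate could be dropped without loss of information, here is a minimal Python sketch of how monitoring tooling might recompute it from the per-reason counters. The helper names (derived_prefetch_skipped, skip_breakdown) and the snapshot values are hypothetical, and the list of per-reason stats is an assumption covering only the four call sites quoted in this ticket; it may be incomplete.

```python
from typing import Dict, Mapping

# Per-reason skip counters named in this ticket. ASSUMPTION: this list covers
# only the four call sites quoted above and may not be exhaustive.
PER_REASON_SKIP_STATS = (
    "prefetch_skipped_internal_session",
    "prefetch_skipped_internal_page",
    "prefetch_skipped_special_handle",
    "prefetch_skipped_disk_read_count",
)


def derived_prefetch_skipped(stats: Mapping[str, int]) -> int:
    """Recompute the aggregate skip count from the per-reason counters."""
    return sum(stats.get(name, 0) for name in PER_REASON_SKIP_STATS)


def skip_breakdown(stats: Mapping[str, int]) -> Dict[str, float]:
    """Return each reason's share of all skips (empty if nothing was skipped)."""
    total = derived_prefetch_skipped(stats)
    if total == 0:
        return {}
    return {name: stats.get(name, 0) / total for name in PER_REASON_SKIP_STATS}


# Hypothetical snapshot of connection statistics, for illustration only.
snapshot = {
    "prefetch_skipped_internal_session": 3,
    "prefetch_skipped_internal_page": 1,
    "prefetch_skipped_special_handle": 0,
    "prefetch_skipped_disk_read_count": 4,
    "prefetch_skipped": 8,
}

# If the aggregate is truly redundant, it matches the derived value.
is_redundant = snapshot["prefetch_skipped"] == derived_prefetch_skipped(snapshot)
```

If a check like is_redundant ever failed on real data, that would suggest a skip path that increments prefetch_skipped without a per-reason companion, which is exactly the kind of case the audit should surface.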
Proposed Solution
- Audit all prefetch stats in src/stat/stat_data.py and their call sites to identify which stats are always co-incremented with a more specific companion.
- Determine whether prefetch_skipped (or similar aggregate stats) adds value as a roll-up, or whether it can be removed in favour of summing the per-reason counters in tooling/monitoring.
- Propose and implement a cleaner stat design: either remove the redundant aggregate, rename stats for clarity, or add documentation strings that make the stat relationships explicit.
- Update any downstream tooling or dashboards that rely on removed/renamed stats.
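The audit step could start from a rough textual scan of the call sites. The sketch below is a heuristic under stated assumptions, not an existing WiredTiger tool: it treats a run of consecutive lines containing WT_STAT_CONN_INCR as one increment site and flags stats that appear at two or more sites and never appear alone. The function name find_aggregate_candidates and the min_sites parameter are hypothetical, and the scan does no control-flow analysis, so its output would still need manual review.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Matches calls such as: WT_STAT_CONN_INCR(session, prefetch_skipped);
INCR_RE = re.compile(r"WT_STAT_CONN_INCR\(\s*\w+\s*,\s*(\w+)\s*\)")


def find_aggregate_candidates(source: str, min_sites: int = 2) -> Dict[str, Set[str]]:
    """Map each likely aggregate stat to the companions it is incremented with.

    A stat is flagged when it appears at `min_sites` or more increment sites
    and has at least one companion stat at every one of them.
    """
    # Group stats from consecutive increment lines into one site each.
    groups: List[List[str]] = []
    current: List[str] = []
    for line in source.splitlines():
        match = INCR_RE.search(line)
        if match:
            current.append(match.group(1))
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)

    # For every stat, record the set of companions seen at each site.
    sites: Dict[str, List[Set[str]]] = defaultdict(list)
    for group in groups:
        for stat in set(group):
            sites[stat].append(set(group) - {stat})

    return {
        stat: set().union(*companion_sets)
        for stat, companion_sets in sites.items()
        if len(companion_sets) >= min_sites and all(companion_sets)
    }


# Two of the call sites quoted in this ticket, used as sample input.
SAMPLE = """
if (F_ISSET(session, WT_SESSION_INTERNAL)) {
    WT_STAT_CONN_INCR(session, prefetch_skipped_internal_session);
    WT_STAT_CONN_INCR(session, prefetch_skipped);
    return (false);
}
if (F_ISSET(ref, WT_REF_FLAG_INTERNAL)) {
    WT_STAT_CONN_INCR(session, prefetch_skipped_internal_page);
    WT_STAT_CONN_INCR(session, prefetch_skipped);
    return (false);
}
"""

candidates = find_aggregate_candidates(SAMPLE)
```

On this sample, only prefetch_skipped is flagged, with the two per-reason stats as its companions. The per-reason stats are not flagged because each appears at a single site.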
Definition of Done
- All redundant prefetch stats are either removed or documented with clear semantics.
- No stat is silently co-incremented with another without a clear reason.
- The change is reflected in stat_data.py and all relevant call sites.