Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: WT12.0.0, 9.0.0-rc0
Affects Version/s: None
Component/s: Cursors
Labels: None
Assigned Teams: Storage Engines - Foundations, Storage Engines - Transactions
Total Hours with Assigned Team: 281.664
Epic Link: SPM-4736
Sprint: None
Story Points: None

Summary

In leader mode the ingest table is empty for reads and is not used for writes: all reads and writes route to the stable table. Despite this, every layered cursor operation still opens, resets and closes the ingest cursor, paying the session cursor-cache / reopen bookkeeping and the dhandle rwlock cost on each call. In disaggregated storage (DSC) those bookkeeping steps translate into extra page-service round trips and measurably hurt read latency.

This change skips opening the ingest cursor while the layered cursor is in leader mode, and adds NULL guards on the few code paths that previously assumed an ingest cursor always existed.

Change

In src/cursor/cur_layered.c:

__clayered_enter — only require stable_cursor for leader, only require ingest_cursor for follower when deciding whether to (re)open constituent cursors.
__clayered_open_cursors — early-return condition treats ingest as already-satisfied for leader; the ingest cursor is opened only when running as a follower.
__clayered_next / __clayered_prev cleanup — guard the ingest_cursor->reset call with a NULL check.
__clayered_largest_key — only call ingest_cursor->largest_key when the ingest cursor is open.
__clayered_next_random — return WT_NOTFOUND if the ingest fallback cursor is not open.

The follower path is unchanged.
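
The guards listed above can be sketched in isolation. The following is a minimal illustration, not the code in src/cursor/cur_layered.c: every type and function name prefixed with sketch_ or SKETCH_ is hypothetical, "opening" a cursor is reduced to pointing at embedded storage, and the follower is assumed to open both constituents, which simplifies whatever the real follower path does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the WiredTiger types or error codes. */
#define SKETCH_NOTFOUND (-1)

typedef struct sketch_cursor {
    int reset_calls; /* counts reset() invocations */
} SKETCH_CURSOR;

typedef struct sketch_layered {
    bool leader;
    SKETCH_CURSOR *stable; /* NULL while not open */
    SKETCH_CURSOR *ingest; /* NULL while not open; stays NULL for a leader */
    SKETCH_CURSOR stable_storage, ingest_storage;
    int opens; /* counts constituent cursor opens */
} SKETCH_LAYERED;

/* Enter: a leader only requires the stable cursor, a follower only the ingest cursor. */
static bool
sketch_need_open(const SKETCH_LAYERED *c)
{
    return c->leader ? c->stable == NULL : c->ingest == NULL;
}

/* Open: ingest counts as already satisfied for a leader and is opened only for a follower. */
static void
sketch_open_cursors(SKETCH_LAYERED *c)
{
    bool ingest_ok = c->leader || c->ingest != NULL;

    if (ingest_ok && c->stable != NULL)
        return; /* early return: nothing left to open */
    if (!c->leader && c->ingest == NULL) {
        c->ingest = &c->ingest_storage;
        c->opens++;
    }
    if (c->stable == NULL) {
        c->stable = &c->stable_storage;
        c->opens++;
    }
    assert(c->stable != NULL);
}

/* Cleanup after next/prev: reset only the constituents that are actually open. */
static void
sketch_reset_constituents(SKETCH_LAYERED *c)
{
    if (c->ingest != NULL)
        c->ingest->reset_calls++;
    if (c->stable != NULL)
        c->stable->reset_calls++;
}

/* Random-position fallback: report not-found when there is no ingest cursor to fall back to. */
static int
sketch_ingest_fallback(const SKETCH_LAYERED *c)
{
    return c->ingest == NULL ? SKETCH_NOTFOUND : 0;
}
```

With a zero-initialized SKETCH_LAYERED whose leader flag is set, sketch_open_cursors performs a single open and leaves ingest as NULL, so every later use of the ingest cursor has to go through a guard like the ones shown.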

Motivation / measurements

Validated on the YCSB in-cache 100% read workload from BF-43331 (DSC 11-node, 5-minute FTDC window, per-query normalized):

metric	baseline DSC	with this change	delta
paliRateLimiter admissions / query	1.50	1.20	-20%
block-manager bytes read / query	61.8	41.7	-33%
CPU user / query (µs)	37.8	36.2	-4%

Read latency on the affected sys-perf task improved from 835 to 801 (target 776); throughput improved from 153,220 to 159,522 ops/s (target 164,818). The optimization closes roughly half of the DSC-vs-ASC regression on its own.
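
The "roughly half" estimate can be checked with simple arithmetic. The sketch below assumes the quoted target values represent the level at which the regression would be fully closed; the ticket does not state this explicitly, and the helper names are hypothetical.

```c
#include <assert.h>

/* Fraction of the baseline-to-target gap recovered by a change (1.0 means fully closed). */
static double
gap_closed(double baseline, double changed, double target)
{
    assert(target != baseline); /* avoid dividing by a zero-width gap */
    return (changed - baseline) / (target - baseline);
}

/* Relative change from baseline, in percent. */
static double
pct_delta(double baseline, double changed)
{
    return 100.0 * (changed - baseline) / baseline;
}
```

Under that assumption, gap_closed(835, 801, 776) is 34/59, about 0.58, for latency, and gap_closed(153220, 159522, 164818) is 6302/11598, about 0.54, for throughput. pct_delta applied to each table row gives the delta column after rounding.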

Related to BF-43331 — [Atlas Infinite vs Atlas Core] Regression in YCSB in-cache and out-of-cache reads.

Assignee: Shoufu Du
Reporter: Shoufu Du
Votes: 0
Watchers: 3

Created: Jun 01 2026 10:53:04 AM UTC
Updated: Jun 17 2026 08:26:02 AM UTC
Resolved: Jun 15 2026 11:22:39 AM UTC
