Type: Task
Resolution: Unresolved
Priority: Minor - P4
Fix Version/s: None
Affects Version/s: None
Component/s: Cache and Eviction
Labels: None
Assigned Teams: Storage Engines, Storage Engines - Transactions
Sprint: StorEng - Defined Pipeline
Story Points: None

In examining some YCSB workloads, I've observed that they sometimes have surprising amounts of IO.

The following FTDC is from a recent (8.1) run of a 100% update YCSB workload on an in-cache data set.

The point of interest here is that the data set fits comfortably in the cache, as seen in the cache fill metrics. (I believe the total data size is ~5GB. The WT cache is configured to 15GB.) But despite that, the WT block manager is consistently issuing thousands of reads per second.

In fact, we are issuing about one read for every two updates, which at first blush seems crazy.
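To make the "one read for every two updates" figure concrete, here is a minimal sketch of how that ratio can be computed from two samples of cumulative counters. The key names `blocks_read` and `updates` are placeholders I've assumed for illustration, not actual FTDC metric paths, and the sample values are made up rather than taken from the attached FTDC.

```python
def reads_per_update(first, last, reads_key="blocks_read", updates_key="updates"):
    """Return block reads issued per update between two cumulative-counter samples.

    The key names are hypothetical placeholders; substitute whatever
    counters your FTDC tooling exposes for block manager reads and updates.
    """
    reads = last[reads_key] - first[reads_key]
    updates = last[updates_key] - first[updates_key]
    if updates <= 0:
        raise ValueError("no updates between samples")
    return reads / updates


# Made-up sample values, for illustration only.
first = {"blocks_read": 1_000, "updates": 10_000}
last = {"blocks_read": 6_000, "updates": 20_000}

ratio = reads_per_update(first, last)
print(f"{ratio:.2f} reads per update")
```

A ratio near 0.5 corresponds to the behavior described above. For a data set that fits in the cache, I would naively expect something much closer to 0.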

The goal of this ticket is to understand why we read so much data in this workload. What is WT reading? Why is it reading it? Should WT be doing these reads, or can we optimize them out?

I've attached the full FTDC as ftdc.tgz.

Attachments:
YCSB 100pct update.png (217 kB, Mar 21 2025 12:29:04 AM UTC, Keith Smith)
ftdc.tgz (535 kB, Mar 21 2025 12:39:46 AM UTC, Keith Smith)
Sulabh.png (117 kB, Mar 21 2025 09:47:07 PM UTC, Keith Smith)
History window 1 vs 300.png (233 kB, Mar 21 2025 09:48:38 PM UTC, Keith Smith)
History Window 1 vs 300 sec.png (333 kB, Mar 21 2025 09:51:53 PM UTC, Keith Smith)

Assignee: Keith Smith
Reporter: Keith Smith
Votes: 0
Watchers: 7

Created: Mar 21 2025 12:28:46 AM UTC
Updated: Apr 08 2025 01:20:58 AM UTC
