WiredTiger / WT-14308

Understand WT read behavior during in-cache 100% update YCSB workload

    • Type: Task
    • Resolution: Unresolved
    • Priority: Minor - P4
    • Affects Version/s: None
    • Component/s: Cache and Eviction
    • Storage Engines
    • StorEng - Defined Pipeline

      In examining some YCSB workloads, I've observed that they sometimes have surprising amounts of IO.

      The following FTDC is from a recent (8.1) run of a 100% update YCSB workload on an in-cache data set.

      The point of interest here is that the data set fits comfortably in the cache, as seen in the cache fill metrics. (I believe the total data size is ~5GB. The WT cache is configured to 15GB.) But despite that, the WT block manager is consistently issuing thousands of reads per second.

      In fact, we are issuing about one read for every two updates, which at first blush seems crazy.
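      The "one read for every two updates" figure is a ratio of per-interval deltas of two cumulative counters. A minimal sketch of that arithmetic follows, assuming samples of the form (seconds, cumulative reads, cumulative updates). The function name, the tuple layout, and the numbers are hypothetical illustrations; they are not actual FTDC field names or values from the attached run.

```python
def reads_per_update(samples):
    """Return the reads-per-update ratio for each consecutive sample pair.

    samples: list of (seconds, cumulative_reads, cumulative_updates) tuples.
    This layout is an assumption made for illustration only.
    """
    ratios = []
    for (_, r0, u0), (_, r1, u1) in zip(samples, samples[1:]):
        d_reads = r1 - r0
        d_updates = u1 - u0
        # Skip intervals with no updates to avoid dividing by zero.
        if d_updates > 0:
            ratios.append(d_reads / d_updates)
    return ratios


# Made-up counters: roughly 5,000 reads/s against 10,000 updates/s.
samples = [(0, 0, 0), (1, 5000, 10000), (2, 10200, 20000)]
ratios = reads_per_update(samples)
```

      A ratio that stays near 0.5 across intervals would match the behavior described above; a ratio that decays toward zero would instead suggest a warm-up effect.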

      The goal of this ticket is to understand why we read so much data in this workload. What is WT reading? Why is it reading it? Should WT be doing these reads, or can we optimize them out?

      I've attached the full FTDC as ftdc.tgz.

        1. ftdc.tgz (535 kB)
        2. History window 1 vs 300.png (233 kB)
        3. History Window 1 vs 300 sec.png (333 kB)
        4. Sulabh.png (117 kB)
        5. YCSB 100pct update.png (217 kB)

            Assignee:
            keith.smith@mongodb.com Keith Smith
            Reporter:
            keith.smith@mongodb.com Keith Smith
            Votes:
            0
            Watchers:
            7
