WiredTiger / WT-4637

Slow read performance after update workload with LSM tree

    • Type: Improvement
    • Resolution: Won't Fix
    • Priority: Minor - P4
    • Affects Version/s: None
    • Component/s: None
    • Labels: None

      A WiredTiger user reported:

      My name is Antonis Papaioannou and I am a PhD candidate in the Computer Science Department at the University of Crete, Greece.

      I am trying to demonstrate the strengths and weaknesses of LSM vs. Btree using WiredTiger (similar to your work [1]).

      I use WiredTiger 3.1.1 and the wtperf tool to simulate workloads.

      During this process I came across some strange behaviour when using the WiredTiger LSM tree implementation. More specifically, while WiredTiger LSM exhibits comparable read performance at the beginning, its performance drops very low once the database size increases after many update operations. This is strange because, based on our monitoring data, there seems to be no evident bottleneck (e.g. in the I/O or CPU subsystem).
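      (For reference, the only difference between the two access methods under test is the configuration string passed when a table is created. Below is a minimal sketch using the WiredTiger C API; the home directory, table names, formats and cache size are illustrative and are not taken from the attached wtperf configurations.)

        #include <stdio.h>
        #include <stdlib.h>
        #include <wiredtiger.h>

        int
        main(void)
        {
            WT_CONNECTION *conn;
            WT_SESSION *session;

            /* Open (or create) a database; the cache size is illustrative. */
            if (wiredtiger_open("WT_HOME", NULL, "create,cache_size=1GB", &conn) != 0) {
                fprintf(stderr, "wiredtiger_open failed\n");
                return (EXIT_FAILURE);
            }
            conn->open_session(conn, NULL, NULL, &session);

            /* A B-tree table: the default access method. */
            session->create(session, "table:demo-btree", "key_format=S,value_format=S");

            /* The same schema stored as an LSM tree: only "type=lsm" changes. */
            session->create(session, "table:demo-lsm", "key_format=S,value_format=S,type=lsm");

            return (conn->close(conn, NULL) == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
        }

      In wtperf the equivalent choice is normally made through the table_config option, which is presumably how the attached lsm-*.wtperf files select the LSM tree.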

      In more detail: we populate the database with 250 million records (resulting in 40 GB of data on disk) and then run a read-only workload multiple times. Our system achieves throughput of up to 21,000 ops/sec.

      Next we run an update-only workload in order to increase the on-disk footprint of the database (resulting in 102 GB of data) and then run the same read-only workload again. This time our system achieves only 500 ops/sec, while there appears to be no stress on the system resources (e.g. I/O or CPU).
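      (In terms of the WiredTiger C API, each operation in that read-only phase is essentially a point lookup through a cursor, along the lines of the sketch below; the table URI and key are illustrative, and wtperf drives many such lookups concurrently from its worker threads.)

        #include <wiredtiger.h>

        /* One point lookup; assumes an already-open WT_SESSION. */
        static int
        lookup_one(WT_SESSION *session, const char *uri, const char *key)
        {
            WT_CURSOR *cursor;
            const char *value;
            int ret;

            if ((ret = session->open_cursor(session, uri, NULL, NULL, &cursor)) != 0)
                return (ret);
            cursor->set_key(cursor, key);
            if ((ret = cursor->search(cursor)) == 0)
                ret = cursor->get_value(cursor, &value);

            cursor->close(cursor);
            return (ret);
        }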

      I attach 2 PDF files containing graphs of representative runs for each experiment.
      Each PDF presents the benchmark results (throughput and latency), CPU monitoring, and I/O monitoring, as well as some of the metrics reported by WiredTiger (we parse the WiredTigerStat file).
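      (For reference, the counters that end up in the WiredTigerStat log, including the LSM maintenance statistics, can also be read in-process through a statistics cursor. The sketch below assumes the connection was opened with statistics enabled, e.g. statistics=(fast), and an already-open WT_SESSION; the helper name is made up for illustration and is not part of the reporter's tooling.)

        #include <inttypes.h>
        #include <stdio.h>
        #include <wiredtiger.h>

        /* Print one snapshot of every connection-level statistic. */
        static int
        dump_statistics(WT_SESSION *session)
        {
            WT_CURSOR *cursor;
            const char *desc, *pvalue;
            int64_t value;
            int ret;

            if ((ret = session->open_cursor(session, "statistics:", NULL, NULL, &cursor)) != 0)
                return (ret);

            /* Each entry is (description, printable value, int64 value). */
            while ((ret = cursor->next(cursor)) == 0) {
                cursor->get_value(cursor, &desc, &pvalue, &value);
                printf("%s=%" PRId64 "\n", desc, value);
            }

            return (cursor->close(cursor));
        }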

      It is interesting that in the "bad" runs, the LSM performs many "LSM tree maintenance operations". I don't think we can blame these maintenance operations, though, as they do not appear to stress the CPU or I/O subsystem.

      I also attach the wtperf configuration files.

      It would be very helpful for us if you could provide feedback regarding our methodology and our evaluation results.

      The behavior does sound unexpected - we should investigate.

        1. bad_run.pdf
          1.02 MB
        2. lsm-populate.wtperf
          0.6 kB
        3. lsm-read.wtperf
          0.6 kB
        4. lsm-update.wtperf
          0.5 kB
        5. normal_run.pdf
          969 kB
        6. optracking-t2.png
          130 kB
        7. optrack-t2-internal-thread.png
          23 kB
        8. WiredTigerStat.bad.run
          26.56 MB
        9. WiredTigerStat.good.run
          26.64 MB

            Assignee:
            sasha.fedorova@mongodb.com Sasha Fedorova
            Reporter:
            alexander.gorrod@mongodb.com Alexander Gorrod
            Votes:
            0
            Watchers:
            7
