WiredTiger / WT-1947

Medium Multi LSM perf test degradation

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: WT2.7.0
    • Labels:
      None
    • # Replies:
      7
    • Last comment by Customer:
      true

      Description

      The changes for bulk load have caused throughput in the medium-multi-lsm.wtperf workload to drop by about 40%. I am able to reproduce this on my AWS HDD box too. Comparing changeset 1f2e107 from http://build.wiredtiger.com:8080/job/wiredtiger-perf-med-multi-lsm/951/ against the first changeset in http://build.wiredtiger.com:8080/job/wiredtiger-perf-med-multi-lsm/952/, c82ed17, I see the same drop in the read/update portion of the test. My first guess is that the compact after populate did not fully compact down to one chunk in the newer changeset: compact took 334 seconds in the old version and 6 seconds in the new changeset.

      Both runs of the test also exhibit dropouts in all throughput lasting 15-25 seconds at a time.

        Issue Links

          Activity

          alexander.gorrod Alexander Gorrod added a comment -

          Bummer. I'd hoped that this change would improve the throughput:
          https://github.com/wiredtiger/wiredtiger/pull/1971

          I dug into this a bit, and noticed that with the new code we were creating a bloom filter on the one remaining chunk, even though bloom_oldest wasn't configured.

          sue.loverso Sue LoVerso added a comment -

          Yeah, that change fixed all the other drops in the charts except this one.

          alexander.gorrod Alexander Gorrod added a comment -

          I understand this one. A fallout of the change to support bulk loading into an LSM tree is that we can create a very large first chunk. That chunk is assigned generation 0 for the merge algorithm, which means it will be chosen for merges with smaller chunks. That is inefficient.

          The solution here is to figure out what a reasonable chunk generation is for a bulk loaded chunk. Doing that will require:

          • Tracking how much data is inserted during the bulk load - practical because a bulk load is single threaded by definition.
          • Applying a calculation to the volume of data inserted to figure out an appropriate generation to apply.

          My suggestion for the generation calculation is:

          bulk_load_size / (chunk_size * (merge_min + merge_max / 2))
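
          As a rough illustration only, the two steps above could be sketched like this in Python. This is not WiredTiger code: the `BulkLoadTracker` class and the `suggested_generation` function are hypothetical names, the formula is read literally with the usual operator precedence (so `merge_max / 2` is evaluated before the addition), and rounding down with integer division is an assumption, since the comment does not say how to round.

```python
class BulkLoadTracker:
    """Hypothetical byte counter for a bulk load.

    No locking is used, on the assumption stated in the thread that a
    bulk load is single threaded.
    """

    def __init__(self):
        self.bytes_inserted = 0

    def record(self, key, value):
        # Count the raw key and value bytes of each inserted item.
        self.bytes_inserted += len(key) + len(value)


def suggested_generation(bulk_load_size, chunk_size, merge_min, merge_max):
    """Evaluate bulk_load_size / (chunk_size * (merge_min + merge_max / 2)).

    Integer (floor) division is an assumption made for this sketch.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    divisor = chunk_size * (merge_min + merge_max // 2)
    if divisor <= 0:
        raise ValueError("merge_min + merge_max // 2 must be positive")
    return bulk_load_size // divisor


if __name__ == "__main__":
    MIB = 1024 ** 2
    # Illustrative values only, not taken from the test configuration.
    print(suggested_generation(10 * 1024 * MIB, 10 * MIB, 4, 16))
    print(suggested_generation(50 * MIB, 10 * MIB, 4, 16))
```

          With these made-up values, a 10 GiB bulk load gets a generation well above 0, while a 50 MiB bulk load stays at generation 0 and is treated like any other freshly written chunk.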

          sue.loverso Sue LoVerso added a comment -

          Alex, we'll see what the plots show. My testing shows some recovery of the original performance with the fix, but it has varied a lot between runs, so there may be more to do here.

          alexander.gorrod Alexander Gorrod added a comment -

          https://github.com/wiredtiger/wiredtiger/pull/2010

          Regression resolved by above pull request.

          xgen-internal-githook Githook User added a comment -

          Author: Alex Gorrod (agorrod) <alexg@wiredtiger.com>

          Message: Calculate a merge generation for bulk loaded LSM chunks.

          Otherwise LSM tends to choose poor merges after a bulk load,
          which leads to degraded performance.

          Refs WT-1947
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/0b42eab5213d539b9d696a57a93778bde4323b32

          xgen-internal-githook Githook User added a comment -

          Author: Susan LoVerso (sueloverso) <sue@wiredtiger.com>

          Message: Minor code refactor per Michael's suggestion. WT-1947
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/74bce81f521a2e0f08bceadea2edbec977f2427c

