WiredTiger / WT-1259

medium-lsm compact regression

    • Type: Task
    • Resolution: Done
    • Fix Version/s: WT2.4.0
    • Affects Version/s: None
    • Component/s: None
    • Labels:

      This issue is a follow-on from WT-1200 and WT-1251. When wtperf calls compact and there is more than one merge thread, we don't fully merge all the chunks, which leaves many Bloom filters behind and slows reads. With one merge thread, merging ends up with just one large chunk.
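      For reference, the triggering operation is an ordinary WT_SESSION::compact call on the LSM tree; a minimal sketch (the URI and wrapper function are illustrative, not copied from wtperf):

          #include <wiredtiger.h>

          /*
           * Sketch of the operation that triggers the problem: compacting an
           * LSM tree.  The "lsm:test" URI is only an example.
           */
          int
          compact_lsm_tree(WT_SESSION *session)
          {
              /*
               * With more than one merge thread, this can return while many
               * small chunks (and their Bloom filters) still remain.
               */
              return (session->compact(session, "lsm:test", NULL));
          }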

      I understand what is going on and why, but haven't yet figured out the best way to fix it.

      Here's one example of what we see in the bad case:
      When session::compact is called, we are writing into chunk 106. Compact enters the flushing phase and loops, waiting until chunk 106 is flushed. While we're flushing the earlier chunks (104-106), one of the merge threads begins a merge of chunks 96-100,102,103 into a new chunk 107. Once chunk 106 is flushed, the compact thread enters the compacting phase.

      The merge work unit then finds chunks 104-106 to merge into chunk 108, which finishes fairly quickly. Additional merge work units that get queued and processed set aggressive to 10 because COMPACTING is set, but they find chunk 108 and then break out of the "look for chunks to merge" loop on chunk 103, which is still involved in the earlier merge. With only one chunk, 108, we don't meet the merge minimum, so we return WT_NOTFOUND. The code in lsm_worker sees that and clears the WT_LSM_TREE_COMPACTING flag. Compact completes and there are still lots of chunks.
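      A simplified sketch of that failing decision (not the real WiredTiger code; the types, fields, and merge minimum here are illustrative): the merge work unit walks back from the newest flushed chunk, stops at the first chunk still claimed by the in-flight merge (103), and with only chunk 108 counted it falls below the merge minimum and returns WT_NOTFOUND.

          #include <stdbool.h>
          #include <stddef.h>

          #define WT_NOTFOUND (-31803)  /* stand-in for WiredTiger's error code */

          struct chunk {
              int id;
              bool merging;  /* still part of the in-flight merge of 96-103 */
              bool on_disk;  /* flushed and eligible for merging */
          };

          /*
           * Walk from the newest chunk toward the oldest, counting merge
           * candidates; stop at the first chunk that is already being merged.
           */
          static int
          pick_merge_span(struct chunk *chunks, size_t nchunks, size_t merge_min)
          {
              size_t candidates = 0;

              for (size_t i = nchunks; i > 0; --i) {
                  struct chunk *c = &chunks[i - 1];
                  if (!c->on_disk)
                      continue;
                  if (c->merging)
                      break;        /* chunk 103: still in the 96-103 merge */
                  ++candidates;     /* only chunk 108 gets counted */
              }

              /* One candidate is below the merge minimum: give up. */
              return (candidates < merge_min ? WT_NOTFOUND : 0);
          }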

      I have some ideas, particularly having compact look at merge_progressing for the earlier merge that is still in progress, so that compact doesn't think it is done too soon.
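      A hedged sketch of that idea, not the eventual fix (the struct, field names, and helper are illustrative stand-ins for the real lsm_worker code, merge_progressing counter, and WT_LSM_TREE_COMPACTING flag): before treating WT_NOTFOUND from the merge lookup as "compaction is done", check whether an earlier merge is still running.

          #include <stdbool.h>

          #define WT_NOTFOUND (-31803)  /* stand-in for WiredTiger's error code */

          struct lsm_tree_state {
              int merges_in_progress;   /* what merge_progressing would indicate */
              bool compacting;          /* stands in for WT_LSM_TREE_COMPACTING */
          };

          /*
           * Only treat WT_NOTFOUND from "look for chunks to merge" as the end of
           * compaction when no earlier merge is still in flight.
           */
          static void
          merge_notfound(struct lsm_tree_state *tree, int ret)
          {
              if (ret != WT_NOTFOUND)
                  return;

              if (tree->merges_in_progress > 0)
                  return;   /* e.g. 96-103 merging into 107: keep waiting */

              tree->compacting = false;
          }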

            Assignee:
            sue.loverso@mongodb.com Susan LoVerso
            Reporter:
            sue.loverso@mongodb.com Susan LoVerso
            Votes:
            0
            Watchers:
            1
