Core Server / SERVER-16665

WiredTiger b-tree uses too much space in journal directory

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 2.8.0-rc3
    • Component/s: Storage, WiredTiger
    • Storage Execution
    • Fully Compatible

      I am running iibench for MongoDB on two hosts. Both use the WiredTiger b-tree, one with snappy and the other with zlib compression. The test uses 10 threads to load 200M documents per thread. Immediately after the load finishes, the zlib test uses ~500G of disk versus ~330G for snappy. After some idle time the zlib disk usage drops to ~160G. At first I thought the problem was specific to zlib, but watching the size of data/journal during the tests shows the problem for both snappy and zlib.
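
      Watching data/journal during a test can be scripted along these lines. This is only a sketch: the helper name sample_journal and the 60-second interval are illustrative assumptions, and it assumes the journal lives in a journal subdirectory of the database root.

```shell
#!/bin/sh
# Hypothetical helper (not part of MongoDB or WiredTiger): print one sample
# of total size, journal size (both in KiB), and the journal file count.
sample_journal() {
    dbroot=$1
    # du -sk prints "<KiB><TAB><path>"; keep only the size column.
    total_kb=$(du -sk "$dbroot" | cut -f1)
    journal_kb=$(du -sk "$dbroot/journal" | cut -f1)
    # tr strips the padding some wc implementations add.
    nfiles=$(ls "$dbroot/journal" | wc -l | tr -d ' ')
    printf '%s total_kb=%s journal_kb=%s journal_files=%s\n' \
        "$(date +%H:%M:%S)" "$total_kb" "$journal_kb" "$nfiles"
}

# Example loop (assumes ./data is the database root; interval is arbitrary):
# while true; do sample_journal data; sleep 60; done
```

      Logging one line per sample makes it easy to see whether the journal file count keeps growing or levels off between checkpoints.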

      My test host has 40 hyperthread cores, 144G of RAM, and fast (PCIe) flash.

      Can WiredTiger delete log files sooner?

      I have a test in progress and it is about 30% done. The problem for the zlib configuration is too much space used in the journal directory. Note that "data" is the root for database files:

      du -hs data data/journal; ls data/journal | wc -l
      163G    data
      131G    data/journal
      1337
      

      And this is from the snappy test at about 20% done:

      du -hs data data/journal; ls data/journal | wc -l
      69G     data
      23G     data/journal
      231
      

      And this is from later in the snappy test:

      du -hs data data/journal; ls data/journal | wc -l
      120G    data
      47G     data/journal
      475
      
      ls -lh data/journal/
      total 25G
      -rw-r--r-- 1 root root 100M Dec 24 07:58 WiredTigerLog.0000001012
      -rw-r--r-- 1 root root 100M Dec 24 07:58 WiredTigerLog.0000001013
      -rw-r--r-- 1 root root 100M Dec 24 07:58 WiredTigerLog.0000001014
      <snip>
      

      And later in the zlib test:

      du -hs data data/journal; ls data/journal | wc -l
      231G    data
      182G    data/journal
      1862
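
      As a rough cross-check, the journal sizes reported by du are about what the file counts predict if every log file is 100M, as in the ls -lh listing. The sketch below assumes that fixed file size and assumes du -h rounds up to the next whole G; the function name is illustrative.

```shell
#!/bin/sh
# Assumption: every journal file is 100 MiB, as shown by ls -lh.
LOG_FILE_MIB=100

# Expected journal size in GiB for a given file count, rounded up.
expected_journal_gib() {
    echo $(( ($1 * LOG_FILE_MIB + 1023) / 1024 ))
}

# File counts taken from the four du/ls samples in this report.
for n in 1337 231 475 1862; do
    printf '%5d files -> ~%sG\n' "$n" "$(expected_journal_gib "$n")"
done
```

      Under those assumptions the estimates come out to 131G, 23G, 47G and 182G, matching the data/journal sizes in the samples, so the space appears to be accounted for by the number of retained log files rather than by anything else in the journal directory.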
      

        1. es.022553 (13 kB, Mark Callaghan)
        2. metrics.2016-05-31T03-47-19Z-00000.gz (2.34 MB, Mark Callaghan)

            Assignee: backlog-server-execution [DO NOT USE] Backlog - Storage Execution Team
            Reporter: mdcallag Mark Callaghan
            Votes: 2
            Watchers: 26
