WiredTiger / WT-6175

tcmalloc fragmentation is worse in 4.4 with durable history

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 4.4.0-rc4
    • Fix Version/s: WT3.2.2, 4.5.1, 4.4.0-rc10
    • Component/s: None
    • Labels: None

      Description

      This isn't new with 4.4.0-rc4; it has been an issue in every 4.4 release candidate I tried. HELP-13660 has a possible explanation for the trigger: 1) modify many documents, then 2) run queries that require long-running scans.

      My test case is Linkbench with a large database. The workload is: 1) load the database, 2) create a secondary index on one of the collections, and 3) run transactions. The problem occurs at step 2, where create index does a scan. The test database is ~200G with Snappy compression, and WiredTiger has cacheSizeGB=40.

      I dump tcmalloc stats after each step. Much more detail is here and the summary is listed below.
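
      A minimal sketch of how the per-step tcmalloc numbers could be summarized, assuming the stats come from the tcmalloc.generic section of serverStatus (heap_size and current_allocated_bytes). The helper name summarize_tcmalloc and the sample values are hypothetical and are not measurements from this ticket.

      ```python
      def summarize_tcmalloc(server_status):
          """Summarize heap usage from a serverStatus-style document.

          Assumes the document has tcmalloc.generic.heap_size and
          tcmalloc.generic.current_allocated_bytes, both in bytes.
          """
          generic = server_status["tcmalloc"]["generic"]
          heap = generic["heap_size"]
          allocated = generic["current_allocated_bytes"]
          # Heap reserved by tcmalloc but not currently allocated to the
          # application: free lists, caches, and unmapped pages.
          not_in_use = heap - allocated
          gib = 2 ** 30
          return {
              "heap_gib": heap / gib,
              "allocated_gib": allocated / gib,
              "not_in_use_gib": not_in_use / gib,
              "not_in_use_pct": 100.0 * not_in_use / heap if heap else 0.0,
          }


      # Illustrative numbers only (hypothetical, not taken from the test runs).
      sample = {
          "tcmalloc": {
              "generic": {
                  "heap_size": 50 * 2 ** 30,
                  "current_allocated_bytes": 40 * 2 ** 30,
              }
          }
      }
      summary = summarize_tcmalloc(sample)
      print(summary)
      ```

      With a live server, the input document could be fetched after each step with pymongo, for example MongoClient().admin.command("serverStatus"), and the summaries compared across load, create index, and transactions.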

      For 4.4.0-rc4, VSZ for the mongod process is ~9G larger after create index compared to VSZ for 4.2.6 or 4.4 prior to the durable history merge.
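
      For anyone repeating the VSZ comparison, here is a rough sketch of one way to read it on Linux, assuming VSZ is taken from the VmSize field of /proc/<pid>/status. The function names and the sample text are hypothetical; the ticket does not say which tool was used to collect VSZ.

      ```python
      def parse_vmsize_kb(status_text):
          """Return the VmSize value in kB from /proc/<pid>/status text, or None."""
          for line in status_text.splitlines():
              if line.startswith("VmSize:"):
                  # Line format: "VmSize:   <number> kB"
                  return int(line.split()[1])
          return None


      def read_vmsize_kb(pid):
          """Read VmSize for a running process (Linux only)."""
          with open(f"/proc/{pid}/status") as f:
              return parse_vmsize_kb(f.read())


      # Illustrative input only (hypothetical values).
      sample_status = "Name:\tmongod\nVmPeak:\t 2097152 kB\nVmSize:\t 1048576 kB\n"
      vsz_kb = parse_vmsize_kb(sample_status)
      print(vsz_kb / 2 ** 20, "GiB")
      ```

      Recording this value for the mongod pid after each workload step gives the same kind of before/after comparison described above.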

      This can be reproduced with the Linkbench2 that is in DSI, with two caveats:
      1) the test will have to be changed to create the secondary index after the load.
      2) I use maxid1=200M, while the code in DSI now uses maxid1=10M.

      I am not sure whether Henrik added a repro to DSI for this when he did the work leading to HELP-13660.

        Attachments

        1. 3stacks.png (155 kB)
        2. comparison.png (177 kB)
        3. fragmentation.png (156 kB)
        4. growth.png (245 kB)
        5. hpe.426.tar (44.26 MB)
        6. linkbench-10G.png (529 kB)
        7. metrics.2020-05-08T14-09-24Z-00000.r1 (9.93 MB)
        8. metrics.2020-05-08T20-17-36Z-00000.r1 (10.00 MB)
        9. metrics.2020-05-09T00-53-05Z-00000.r1 (621 kB)
        10. metrics.interim (190 kB)
        11. metrics.interim.r1 (22 kB)
        12. repro-32-5G.png (367 kB)
        13. wt6175.lb200m.may14.tar (48.80 MB)

              People

              Assignee: Michael Cahill (michael.cahill)
              Reporter: Mark Callaghan (mark.callaghan)
              Votes: 0
              Watchers: 21

                Dates

                Created:
                Updated:
                Resolved: