Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-44752

Big CPU performance regression between 4.0.13 and 4.2.1 (on deletion it seems), about 16 times

    • Type: Icon: Bug Bug
    • Resolution: Duplicate
    • Priority: Icon: Major - P3 Major - P3
    • 4.2.3
    • Affects Version/s: 4.2.1
    • Component/s: WiredTiger
    • Labels:
      None
    • ALL

      Hi, it's me again with another big performance regression
      I recently tried upgrading to 4.2 to try the IO Watchdog and unfortunately found another big performance regression as I usually do when upgrading. With my same cron which allowed me to detect SERVER-35958, running at 11am UTC and doing a lot of insert and delete in one collection (roughly same number every day), I noticed this mongo CPU load difference:

      4.0.13 4.2.1
      Oct 17th 12pm UTC+1 Oct 19th 12pm UTC+1

      Mind the scale here, the average CPU usage went from ~12% to ~15% (not a big deal as I noticed a decreased in disk IO write so I don't consider this a regression) But the spike in the middle during this cron, went from from +10% to +40% and the duration went from 30 minutes to ~2h, which means 16 times more CPU time generated by this script !

      By looking at the log of the mongo requests this cron makes looking for the reason of the time increase I noticed there were no more calls than usual but the deletes were taking much longer, about 2s with 4.2.1 instead of 0.1s with 4.0.13, also they are getting slower and slower it seems. The script is doing roughly 300k inserts one after the other, interleaved with 5k deletes, all in the same collection. These are big delete though, containing up to 1000 IDs each.

      > zcat cron-4.2.log.gz | fgrep delete | wc -l
      5522
      > zcat cron-4.2.log.gz | fgrep insert | wc -l
      282192
      > zcat cron-4.0.log.gz | fgrep delete | wc -l
      5530
      > zcat cron-4.0.log.gz | fgrep insert | wc -l
      282860
      
      > zcat cron-4.0.log.gz | fgrep delete | fgrep SUCCEEDED | head
      D, [2019-11-17T11:05:09.753261 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.073s
      D, [2019-11-17T11:05:12.168360 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.074s
      D, [2019-11-17T11:05:14.504066 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.083s
      D, [2019-11-17T11:05:16.813694 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.065s
      D, [2019-11-17T11:05:19.014325 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.072s
      D, [2019-11-17T11:05:21.253102 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.077s
      D, [2019-11-17T11:05:23.437944 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.074s
      D, [2019-11-17T11:05:25.730171 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.088s
      D, [2019-11-17T11:05:27.930548 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.065s
      D, [2019-11-17T11:05:30.302693 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.069s
      
      > zcat cron-4.0.log.gz | fgrep delete | fgrep SUCCEEDED | tail
      D, [2019-11-17T11:30:31.186904 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.089s
      D, [2019-11-17T11:30:31.592353 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.113s
      D, [2019-11-17T11:30:31.999389 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.090s
      D, [2019-11-17T11:30:32.364738 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.104s
      D, [2019-11-17T11:30:32.703364 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.103s
      D, [2019-11-17T11:30:33.106916 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.162s
      D, [2019-11-17T11:30:33.475759 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.096s
      D, [2019-11-17T11:30:33.783008 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.102s
      D, [2019-11-17T11:30:34.084768 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.094s
      D, [2019-11-17T11:35:49.991567 #2934] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.145s
      
      > zcat cron-4.2.log.gz | fgrep delete | fgrep SUCCEEDED | head
      D, [2019-11-19T11:05:10.667866 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.121s
      D, [2019-11-19T11:05:13.187504 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.087s
      D, [2019-11-19T11:05:16.095652 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.121s
      D, [2019-11-19T11:05:18.996173 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.153s
      D, [2019-11-19T11:05:21.984816 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.109s
      D, [2019-11-19T11:05:25.213349 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.121s
      D, [2019-11-19T11:05:28.302843 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.122s
      D, [2019-11-19T11:05:31.374013 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.075s
      D, [2019-11-19T11:05:33.750134 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.094s
      D, [2019-11-19T11:05:36.268803 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.073s
      
      > zcat cron-4.2.log.gz | fgrep delete | fgrep SUCCEEDED | tail
      D, [2019-11-19T12:49:15.333525 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 1.529s
      D, [2019-11-19T12:49:17.627543 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 1.956s
      D, [2019-11-19T12:49:18.903869 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.278s
      D, [2019-11-19T12:49:21.606581 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 2.404s
      D, [2019-11-19T12:49:23.395561 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 1.506s
      D, [2019-11-19T12:49:24.454508 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.794s
      D, [2019-11-19T12:49:27.017986 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 2.242s
      D, [2019-11-19T12:49:28.572405 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 1.246s
      D, [2019-11-19T12:49:31.762715 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 2.878s
      D, [2019-11-19T12:56:04.510676 #17303] DEBUG -- : MONGODB | db2.updn.io:27017 | updown.delete | SUCCEEDED | 0.077s
      

      I'll upload the full log for the Oct 17th and Oct 19th samples and the diagnostic data as soon as you provide an upload portal.

        1. cpu-4.0.png
          cpu-4.0.png
          42 kB
        2. cpu 4.2.png
          cpu 4.2.png
          41 kB
        3. mongo-4.2-regression.png
          mongo-4.2-regression.png
          47 kB
        4. workload.png
          workload.png
          96 kB

            Assignee:
            bruce.lucas@mongodb.com Bruce Lucas (Inactive)
            Reporter:
            bigbourin@gmail.com Adrien Jarthon
            Votes:
            1 Vote for this issue
            Watchers:
            22 Start watching this issue

              Created:
              Updated:
              Resolved: