Core Server / SERVER-25249

High I/O load in secondary nodes with WiredTiger engine (caused by journaling)

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 3.2.7
    • Component/s: WiredTiger
    • Steps to Reproduce: Simply insert data heavily.
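
      For reference, a minimal sketch of the kind of insert-heavy workload described above, assuming a locally reachable deployment and using PyMongo; the connection string, database, collection name, and document shape are illustrative assumptions, not taken from the original report:

      # Hypothetical insert-heavy load generator; all names are placeholders.
      from pymongo import MongoClient

      client = MongoClient("mongodb://localhost:27017/")
      coll = client.loadtest.docs

      template = [{"seq": i, "payload": "x" * 1024} for i in range(1000)]
      while True:
          # Re-copy the documents each round: insert_many adds _id fields in
          # place, and reinserting the same _id values would fail.
          coll.insert_many([dict(doc) for doc in template])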

      Hi,
      I upgraded my MongoDB cluster from 3.0 to 3.2.7 recently and found that the I/O load increased a lot (about 100% on primary nodes and 500% on secondary nodes, mainly judged by the %util column of the iostat command).
      After reading the docs, I learned that journaling behavior changed slightly in 3.2 (the journal is flushed every 50 ms), so I tried disabling journaling and the I/O load dropped back to the level it had in 3.0. So I guess that flushing the journal frequently is the main reason the primary nodes' I/O load went up by about 100%.
      But I can't find anything in the docs that explains why the secondaries' I/O load increased by 500%. So I used strace to track the mongod thread that is in charge of flushing the journal (traced for about 10 seconds).
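
      To repeat this measurement, here is a rough sketch of how a per-thread syscall summary like the ones below can be collected, assuming a Linux host with strace installed, permission to trace mongod, and that the TID of the journal-flushing thread has already been looked up (for example via /proc/<mongod pid>/task/*/comm); the helper name, the placeholder TID, and the 10-second window are illustrative assumptions:

      # Hypothetical helper: attach "strace -c" to one mongod thread for a
      # fixed window and return the summary table strace prints on detach.
      import signal
      import subprocess

      def syscall_summary(tid, seconds=10):
          proc = subprocess.Popen(
              ["strace", "-c", "-p", str(tid)],
              stderr=subprocess.PIPE, universal_newlines=True)
          try:
              # strace stays attached until interrupted, so this normally
              # just times out after the measurement window.
              proc.wait(timeout=seconds)
          except subprocess.TimeoutExpired:
              # SIGINT makes strace detach and print its -c summary.
              proc.send_signal(signal.SIGINT)
          _, err = proc.communicate()
          return err  # strace writes the summary to stderr

      # 12345 is a placeholder; substitute the journal flusher thread's TID.
      print(syscall_summary(12345))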

      Primary node:

      % time     seconds  usecs/call     calls    errors syscall
      ------ ----------- ----------- --------- --------- ----------------
       58.86    3.921470      13476       291       143   futex
       40.54    2.700511      18754       144             pwrite
        0.30    0.020001      10001         2             fdatasync
      

      Secondary node:

      % time     seconds  usecs/call     calls    errors syscall
      ------ ----------- ----------- --------- --------- ----------------
       83.04    4.272435       14682       291             fdatasync
       16.95    0.871993         998       874         9   futex
        0.01    0.000461           2       288             pwrite
      

      From the above we can see that the number of pwrite calls on the secondary node (288) is nearly twice that of the primary (144), and that the secondary issues about as many fdatasync calls as pwrite calls (291 vs. 288), far more than the primary's 2. In other words, on the secondary almost every journal write is followed by its own fdatasync. Is this the reason why the secondaries' I/O load increased by 500%? Is it a bug or by design?
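
      To put rough numbers on that, a quick back-of-the-envelope calculation from the two summaries above (the ~10-second trace window is taken from the description, so the per-second rates are only approximate):

      # Approximate syscall rates derived from the ~10 s strace summaries above.
      window = 10.0  # seconds, rough duration of each trace

      primary = {"pwrite": 144, "fdatasync": 2}
      secondary = {"pwrite": 288, "fdatasync": 291}

      for name, counts in (("primary", primary), ("secondary", secondary)):
          print("%-9s pwrite/s=%5.1f fdatasync/s=%5.1f"
                % (name, counts["pwrite"] / window, counts["fdatasync"] / window))

      # primary:   ~14.4 pwrite/s, ~0.2 fdatasync/s
      # secondary: ~28.8 pwrite/s, ~29.1 fdatasync/s -- roughly one fdatasync
      # per journal write on the secondary over the same window.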

            Assignee: Kelsey Schubert (kelsey.schubert@mongodb.com)
            Reporter: stronglee (strlee)
            Votes: 0
            Watchers: 12
