Details
- Type: Bug
- Status: Closed
- Priority: Major - P3
- Resolution: Works as Designed
- Affects Version/s: 3.2.7
- Fix Version/s: None
- Operating System: ALL
Description
Hi,
I recently upgraded my MongoDB cluster from 3.0 to 3.2.7 and found that the I/O load increased a lot (about 100% on the primary nodes and 500% on the secondary nodes, judged mainly by the %util column of iostat).
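For reference, %util here is the rightmost column of iostat's extended output, sampled with something like the following (the 1-second interval is only an example):

iostat -x 1    # extended per-device stats every second; %util is the last column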
After reading the docs, I learned that journaling behavior changed a little in 3.2 (the journal is flushed every 50 ms), so I tried disabling journaling, and the I/O load dropped back to where it was in 3.0. So I guess the more frequent journal flushing is the main reason the primary nodes' I/O load rose by 100%.
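Disabling journaling for the test can be done with a restart along these lines (the dbpath is only a placeholder):

mongod --dbpath /data/db --nojournal    # same effect as storage.journal.enabled: false in the config file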
But I can't find anything in the docs that explains why the secondaries' I/O load increased by 500%. So I used strace to trace the mongod thread that is in charge of flushing the journal (for about 10 seconds).
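A summary like the ones below can be gathered with something along these lines, where <journal_thread_tid> is a placeholder for the journal thread's id (it can be found with top -H -p $(pidof mongod)):

strace -c -p <journal_thread_tid>    # attach for ~10 seconds, then Ctrl-C to print the per-syscall summary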
Primary node:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 58.86    3.921470       13476       291       143 futex
 40.54    2.700511       18754       144           pwrite
  0.30    0.020001       10001         2           fdatasync

Secondary node:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 83.04    4.272435       14682       291           fdatasync
 16.95    0.871993         998       874         9 futex
  0.01    0.000461           2       288           pwrite
From the above, the secondary makes nearly twice as many pwrite calls as the primary (288 vs. 144), and its fdatasync calls are about as numerous as its pwrite calls (291 vs. 288), far more than the primary's (2). Is this why the secondaries' I/O load increased by 500%? Is this a bug, or is it by design?
Issue Links
- is related to: SERVER-26040 High CPU/IOWAIT in MongoDB 3.2.9 (Closed)
- related to: SERVER-53667 High rate of journal flushes on secondary in 4.4 (Closed)