- Type: Bug
- Resolution: Cannot Reproduce
- Priority: Major - P3
- None
- Affects Version/s: 2.6.10
- Component/s: MMAPv1, Replication, Usability
- Labels: None
- ALL
mongod is the only service running on the machine. During the daytime, the mongod instance hangs for more than 30 s a few times.
While mongod is hanging:
- No log output is written
- A mongostat connected to the mongod stops printing stats
- Queries get no response (and then time out)
- There are no disk read operations and only a few disk write operations
- MajorPF drops from 20~30 to 0, and MinPF drops from 20~30k to less than 10k
- Network read traffic drops from 600kb+ to 10~150kb, and write traffic drops from 3MB+ to 10~30kb
Here is the dstat data collected while mongod (PRIMARY) was hanging:
time |run blk new|usr sys idl wai hiq siq| read writ| read writ| recv send| in out | used free|files inodes|tot tcp udp raw frg|majpf minpf alloc free
17-06 10:13:05|4.0 0 1.0| 9 5 87 0 0 0|3336k 2472k|50.0 15.0 | 786k 13M| 0 0 | 38M 474M|20720 96300 | 2 2 9 0 0| 33 6634 8385 4439
17-06 10:13:06|3.0 0 1.0| 10 5 85 0 0 0|2728k 2328k|43.0 14.0 | 873k 9604k| 0 0 | 38M 474M|20720 96300 | 2 2 9 0 0| 23 27k 10k 12k
17-06 10:13:07|3.0 0 7.0| 8 5 87 0 0 0|1728k 2204k|28.0 15.0 | 724k 5850k| 0 0 | 38M 474M|20720 96298 | 2 2 9 0 0| 19 31k 10k 8094
17-06 10:13:08|2.0 0 0| 8 5 87 0 0 0|2136k 65M|31.0 3886 | 609k 9773k| 0 0 | 38M 474M|20720 96298 | 2 2 9 0 0| 21 4047 4987 3614
17-06 10:13:09|2.0 0 0| 7 5 88 0 0 0|1212k 1248k|17.0 9.00 | 475k 3621k| 0 0 | 38M 474M|20720 96307 | 2 2 9 0 0| 8 7449 6747 4435
.... hang here .....
17-06 10:13:10|3.0 0 0| 4 4 91 0 0 0| 0 0 | 0 0 | 121k 17k| 0 0 | 38M 474M|20720 96307 | 2 2 9 0 0| 0 1121 1095 1081
17-06 10:13:11|2.0 0 1.0| 4 4 91 0 0 0| 0 0 | 0 0 | 109k 19k| 0 0 | 38M 474M|20720 96307 | 2 2 9 0 0| 0 1944 1874 1875
17-06 10:13:12|2.0 0 7.0| 4 5 91 0 0 0| 0 16k| 0 2.00 | 95k 18k| 0 0 | 38M 474M|20720 96298 | 2 2 9 0 0| 0 8095 5435 5427
17-06 10:13:13|2.0 0 0| 4 4 92 0 0 0| 0 0 | 0 0 | 111k 27k| 0 0 | 38M 474M|20720 96301 | 2 2 9 0 0| 0 88 13 17
17-06 10:13:14|2.0 0 1.0| 4 5 91 0 0 0| 0 156k| 0 2.00 | 63k 14k| 0 0 | 38M 474M|20720 96307 | 2 2 9 0 0| 0 3235 2951 2951
17-06 10:13:15|2.0 0 0| 4 5 91 0 0 0| 0 0 | 0 0 | 47k 13k| 0 0 | 38M 474M|20720 96307 | 2 2 9 0 0| 0 3308 2934 2934
17-06 10:13:16|2.0 0 2.0| 4 4 92 0 0 0| 0 0 | 0 0 | 17k 9630B| 0 0 | 38M 474M|20720 96300 | 2 2 9 0 0| 0 268 18 18
17-06 10:13:17|3.0 0 5.0| 4 5 91 0 0 0| 0 16k| 0 2.00 |8419B 6844B| 0 0 | 38M 474M|20720 96301 | 2 2 9 0 0| 0 6301 4134 3144
17-06 10:13:18|2.0 0 3.0| 4 4 91 0 0 0| 0 0 | 0 0 |4435B 4755B| 0 0 | 38M 474M|20720 96299 | 2 2 9 0 0| 0 2101 1437 2418
17-06 10:13:19|2.0 0 0| 4 5 91 0 0 0| 0 0 | 0 0 | 44k 12k| 0 0 | 38M 474M|20720 96315 | 2 2 9 0 0| 0 3225 2954 2954
17-06 10:13:20|2.0 0 0| 4 5 91 0 0 0| 0 0 | 0 0 | 158k 29k| 0 0 | 38M 474M|20720 96315 | 2 2 9 0 0| 0 3529 2973 2973
17-06 10:13:21|2.0 0 0| 4 4 92 0 0 0| 0 12k| 0 2.00 | 117k 27k| 0 0 | 38M 474M|20720 96315 | 2 2 9 0 0| 0 96 2 2
17-06 10:13:22|2.0 0 0| 4 5 91 0 0 0| 0 0 | 0 0 | 101k 19k| 0 0 | 38M 474M|20720 96315 | 2 2 9 0 0| 0 2967 2946 2944
17-06 10:13:23|2.0 0 8.0| 5 5 91 0 0 0| 0 16k| 0 2.00 | 166k 28k| 0 0 | 38M 474M|20720 96315 | 2 2 9 0 0| 0 9979 6095 5909
17-06 10:13:24|2.0 0 1.0| 4 4 92 0 0 0| 0 0 | 0 0 | 90k 20k| 0 0 | 38M 474M|20720 96316 | 2 2 9 0 0| 0 90 8 3
17-06 10:13:25|2.0 0 5.0| 4 4 91 0 0 0| 0 0 | 0 0 | 52k 17k| 0 0 | 38M 474M|20720 96323 | 2 2 9 0 0| 0 4191 3261 3247
17-06 10:13:26|2.0 0 1.0| 4 4 92 0 0 0| 0 0 | 0 0 | 16k 8844B| 0 0 | 38M 474M|20720 96323 | 2 2 9 0 0| 0 75 6 11
17-06 10:13:27|2.0 0 0| 4 4 91 0 0 0| 0 0 | 0 0 | 10k 9190B| 0 0 | 38M 474M|20720 96323 | 2 2 9 0 0| 0 2986 2938 2936
17-06 10:13:28|2.0 0 7.0| 4 5 91 0 0 0| 0 28k| 0 4.00 |8656B 5238B| 0 0 | 38M 474M|20720 96323 | 2 2 9 0 0| 0 9222 6140 6089
17-06 10:13:29|2.0 0 0| 4 4 92 0 0 0| 0 0 | 0 0 | 24k 8107B| 0 0 | 38M 474M|20720 96323 | 2 2 9 0 0| 0 85 9 9
17-06 10:13:30|2.0 0 0| 4 4 91 0 0 0| 0 0 | 0 0 | 131k 27k| 0 0 | 38M 474M|20720 96323 | 2 2 9 0 0| 0 2994 2943 2940
17-06 10:13:31|2.0 0 25| 4 4 91 0 0 0| 0 0 | 0 0 | 130k 35k| 0 0 | 38M 474M|20720 96432 | 2 2 9 0 0| 0 3113 3207 3025
17-06 10:13:32|2.0 0 45| 4 4 92 0 0 0| 0 0 | 0 0 | 100k 34k| 0 0 | 38M 474M|20720 96478 | 2 2 9 0 0| 0 454 511 25
17-06 10:13:33|2.0 0 50| 5 5 91 0 0 0| 0 10M| 0 790 | 112k 34k| 0 0 | 38M 474M|20720 97036 | 2 2 9 0 0| 0 11k 7378 6799
17-06 10:13:34|3.0 0 62| 4 4 91 0 0 0| 0 112k| 0 8.00 | 157k 44k| 0 0 | 38M 474M|20720 97098 | 2 2 9 0 0| 0 2957 2862 2154
17-06 10:13:35|2.0 0 20| 4 4 92 0 0 0| 0 0 | 0 0 | 44k 15k| 0 0 | 38M 474M|20720 97719 | 2 2 9 0 0| 0 1853 1544 1321
17-06 10:13:36|2.0 0 9.0| 4 4 91 0 0 0| 0 52k| 0 6.00 | 20k 11k| 0 0 | 38M 474M|20720 97916 | 2 2 9 0 0| 0 3414 3362 3250
17-06 10:13:37|2.0 0 3.0| 4 4 92 0 0 0| 0 92k| 0 6.00 | 40k 8092B| 0 0 | 38M 474M|20720 97919 | 2 2 9 0 0| 0 647 67 21
17-06 10:13:38|2.0 0 11| 5 5 91 0 0 0| 0 44k| 0 2.00 | 10k 10k| 0 0 | 38M 474M|20720 97259 | 2 2 9 0 0| 0 11k 7395 22k
17-06 10:13:39|2.0 0 32| 4 4 91 0 0 0| 0 32k| 0 6.00 | 27k 18k| 0 0 | 38M 474M|20720 97336 | 2 2 9 0 0| 0 7432 4474 4165
17-06 10:13:40|5.0 0 104| 4 5 91 0 0 0| 0 0 | 0 0 | 170k 63k| 0 0 | 38M 474M|20768 97441 | 2 2 9 0 0| 0 1581 1540 15k
17-06 10:13:41|2.0 0 95| 4 5 91 0 0 0| 0 0 | 0 0 | 159k 59k| 0 0 | 38M 474M|20768 98703 | 2 2 9 0 0| 0 5230 5358 17k
17-06 10:13:42|2.0 0 89| 4 5 91 0 0 0| 0 0 | 0 0 | 132k 54k| 0 0 | 38M 474M|20816 99730 | 3 2 9 0 0| 0 4978 5424 5233
17-06 10:13:43|2.0 0 83| 4 5 91 0 0 0| 0 32k| 0 4.00 | 109k 43k| 0 0 | 38M 474M|20864 99800 | 3 2 9 0 0| 0 8355 5044 5214
..... resumed here ....
17-06 10:13:44|5.0 0 78| 17 7 75 0 0 0|8052k 14M| 305 17.7k|1673k 17M| 0 0 | 38M 474M|20912 97331 | 2 2 9 0 0| 92 42k 28k 29k
17-06 10:13:45|7.0 0 0| 20 6 74 0 0 0|2828k 2980k| 118 18.0 |1568k 6635k| 0 0 | 38M 474M|20912 97331 | 2 2 9 0 0| 32 28k 17k 12k
17-06 10:13:46|3.0 0 2.0| 17 6 77 0 0 0|1256k 1852k|37.0 12.0 |1370k 3418k| 0 0 | 38M 474M|20912 97338 | 2 2 9 0 0| 13 21k 12k 11k
17-06 10:13:47|6.0 0 0| 18 6 75 0 0 0|6864k 4564k| 156 22.0 |2001k 27M| 0 0 | 38M 474M|20912 97345 | 2 2 9 0 0| 58 36k 25k 23k
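
To locate such windows in a longer capture without reading it by eye, the dstat rows can be scanned for the two symptoms listed above: zero disk reads and zero major page faults. The following is only a minimal sketch, not part of the original report. It assumes the same 11 column groups separated by `|` as in the dump above, one row per second, and 1024-based size suffixes (`B`, `k`, `M`, `G`). The function names `to_number`, `parse_row` and `find_stalls` are hypothetical.

```python
import re

# Size suffixes as printed by dstat; the 1024 base is an assumption.
UNITS = {"B": 1, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
TIMESTAMP = re.compile(r"^\d\d-\d\d \d\d:\d\d:\d\d$")


def to_number(token):
    """Convert a dstat token such as '3336k', '65M', '9630B' or '17.0' to a float."""
    if token[-1] in UNITS:
        return float(token[:-1]) * UNITS[token[-1]]
    return float(token)


def parse_row(line):
    """Return (timestamp, disk_read_bytes, majpf) for a data row, else None."""
    groups = [g.strip() for g in line.split("|")]
    # Skip the header, the "hang here" / "resumed here" markers, and any
    # line that does not have the 11 column groups assumed for this layout.
    if len(groups) != 11 or not TIMESTAMP.match(groups[0]):
        return None
    disk_read = to_number(groups[3].split()[0])   # first "read writ" group
    majpf = to_number(groups[10].split()[0])      # "majpf minpf alloc free"
    return groups[0], disk_read, majpf


def find_stalls(lines, min_rows=5):
    """Return (first, last) timestamps of runs of at least min_rows rows
    in which both disk reads and major page faults are zero."""
    stalls, run = [], []
    for line in lines:
        row = parse_row(line)
        if row is None:
            continue
        stamp, disk_read, majpf = row
        if disk_read == 0 and majpf == 0:
            run.append(stamp)
            continue
        if len(run) >= min_rows:
            stalls.append((run[0], run[-1]))
        run = []
    if len(run) >= min_rows:
        stalls.append((run[0], run[-1]))
    return stalls
```

Passing the lines of a saved capture to `find_stalls` returns a list of (start, end) timestamp pairs. For the rows shown above, it should report one window from 10:13:10 to 10:13:43. The zero-read, zero-majpf criterion is a heuristic taken from the symptoms in this report, so an idle but healthy mongod would also match it.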