Now - running fine, not at peak traffic:
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
21 4027 23 108 0 80 0 34g 51.8g 5.58g 55 4.4 0 0|0 1|0 659k 8m 4149 13:46:38
17 3150 15 99 0 84 0 34g 51.8g 5.58g 23 2.1 0 0|0 0|0 463k 6m 4144 13:46:39
4 3471 7 85 2 47 0 34g 51.8g 5.58g 26 1.6 0 0|0 1|0 436k 6m 4144 13:46:40
6 2960 8 115 0 3 0 34g 51.8g 5.58g 18 2.1 0 1|0 1|0 361k 5m 4144 13:46:41
19 3397 12 127 0 9 0 34g 51.8g 5.58g 27 2.6 0 0|0 0|0 476k 6m 4139 13:46:42
8 3206 10 85 0 219 0 34g 51.8g 5.58g 37 2 0 0|0 2|0 428k 7m 4139 13:46:43
14 3285 15 90 0 17 0 34g 51.7g 5.58g 45 7.8 0 0|0 1|0 451k 6m 4134 13:46:44
0 3233 4 104 2 29 0 34g 51.7g 5.58g 36 1.9 0 0|0 1|0 407k 6m 4134 13:46:45
0 2972 1 124 0 18 0 34g 51.7g 5.58g 26 2 0 0|0 1|0 364k 5m 4130 13:46:46
8 2874 9 78 0 9 0 34g 51.7g 5.58g 50 6.5 100 0|0 2|0 368k 6m 4130 13:46:47
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
4 3282 9 113 0 17 0 34g 50.1g 5.75g 28 2.1 0 0|0 1|0 447k 5m 3706 13:58:25
7 2801 10 105 0 11 0 34g 50g 5.75g 21 3 0 0|0 0|0 448k 5m 3696 13:58:26
9 3078 4 104 0 22 0 34g 50g 5.75g 36 1.7 0 10|2 11|2 382k 5m 3691 13:58:27
15 2782 18 57 2 6 0 34g 50g 5.75g 35 1.5 0 0|0 1|1 417k 5m 3686 13:58:28
14 2972 14 149 0 9 0 34g 50g 5.75g 16 3 0 35|4 38|4 433k 5m 3686 13:58:29
23 3177 18 88 0 17 0 34g 50g 5.75g 25 4.8 0 0|0 0|0 492k 6m 3686 13:58:30
13 2654 11 76 2 14 0 34g 50g 5.75g 26 1.8 0 0|0 1|0 447k 5m 3687 13:58:31
14 3142 18 81 0 221 0 34g 50g 5.75g 40 23.3 0 27|10 34|10 422k 8m 3682 13:58:32
16 2995 13 63 2 12 0 34g 50g 5.75g 34 4.7 0 0|0 0|0 445k 6m 3677 13:58:33
8 3271 8 96 0 18 0 34g 49.9g 5.75g 34 1.9 0 48|2 50|3 545k 6m 3672 13:58:34
- free
total used free shared buffers cached
Mem: 7077888 7050196 27692 0 12876 6098780
-/+ buffers/cache: 938540 6139348
Swap: 2097144 72492 2024652
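As a side note on reading the `free` output above: the "-/+ buffers/cache" row is just the Mem row adjusted to treat buffers and page cache as reclaimable, which matters for an mmap-based server like mongod. A minimal sketch of that arithmetic, using the numbers from the output above (all in kB):

```javascript
// Values copied from the `free` output above (kB).
const mem = { total: 7077888, used: 7050196, free: 27692, buffers: 12876, cached: 6098780 };

// From an application's point of view, buffers and cache are reclaimable,
// so they count toward "free" memory.
const effectiveFree = mem.free + mem.buffers + mem.cached; // matches the -/+ buffers/cache "free" column
const effectiveUsed = mem.total - effectiveFree;           // matches the -/+ buffers/cache "used" column

console.log(effectiveUsed, effectiveFree);
```

So almost all of the box's RAM is page cache, which is expected with memory-mapped data files.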
{
"db" : "xxxxx",
"collections" : 38,
"objects" : 15235640,
"avgObjSize" : 1659.7376002583417,
"dataSize" : 25287164572,
"storageSize" : 27740390048,
"numExtents" : 228,
"indexes" : 118,
"indexSize" : 7194130048,
"fileSize" : 38569771008,
"ok" : 1
}
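For what it's worth, the db.stats() numbers above are internally consistent; a quick sanity check (field names taken from the output above):

```javascript
// Values copied from the db.stats() output above.
const stats = {
  objects: 15235640,
  avgObjSize: 1659.7376002583417,
  dataSize: 25287164572,
  storageSize: 27740390048,
};

// avgObjSize is defined as dataSize / objects; storageSize exceeds
// dataSize because of preallocated extents and padding.
const computedAvg = stats.dataSize / stats.objects;
console.log(computedAvg);
console.log(stats.storageSize >= stats.dataSize);
```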
During the problems in the last day:
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
0 2624 1 60 0 7 0 34g 64.3g 5.05g 83 0.4 0 54|1010 181|1010 296k 4m 6950 16:18:33
1 2985 0 78 0 9 0 34g 64.4g 5.05g 109 4.8 0 33|1048 100|1048 376k 6m 6964 16:18:34
6 3034 3 85 0 5 0 34g 64.4g 5.05g 70 6.3 0 137|1060 190|1060 365k 5m 6971 16:18:35
3 3197 0 101 0 18 0 34g 64.5g 5.05g 88 1.7 0 47|1109 98|1109 385k 6m 6982 16:18:36
2 2780 1 83 0 7 0 34g 64.5g 5.05g 91 9.6 0 32|1120 84|1123 325k 5m 6991 16:18:37
2 2556 0 86 0 10 0 34g 64.5g 5.05g 74 1.4 0 55|1127 109|1128 307k 5m 6977 16:18:38
1 2511 3 75 0 20 0 34g 64.6g 5.06g 93 11.1 0 61|1141 111|1141 304k 5m 7008 16:18:39
9 2981 4 106 0 8 0 34g 64.6g 5.06g 92 4 0 32|1107 93|1107 361k 7m 7026 16:18:40
2 2521 5 74 0 6 1 34g 64.7g 5.06g 77 1.6 0 99|1100 157|1100 295k 6m 7033 16:18:41
8 3536 3 131 0 4 0 34g 64.6g 5.06g 94 3.6 0 15|1079 78|1080 432k 6m 7017 16:18:42
From where I sit, it looks like mongod was starved of disk I/O by another process for a while, and then it hit a 100% CPU wall (it normally runs under 60%) and couldn't recover the way it sometimes does (when it hits the CPU wall, mongod actually starts using less disk I/O). So it had long-running queries lagging in the queue, and it couldn't even report them via currentOp(). I restarted mongod at some point, and it ran into the same trouble again fairly soon.
After the traffic slowed down and the queue cleared, mongod started running currentOp() correctly without a restart...
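The symptom is visible in the qr|qw and ar|aw columns of the "problems" sample above: over a thousand writers queued, versus 0|0 or single digits during normal operation. A small sketch (hypothetical helper, not an official tool) of how one could scan those values to flag a backlog:

```javascript
// Hypothetical helper: sum the read/write queue depths from a mongostat
// "qr|qw" field, e.g. "54|1010" -> 1064.
function queueDepth(qrqw) {
  const [qr, qw] = qrqw.split("|").map(Number);
  return qr + qw;
}

// Example qr|qw values copied from the "problems" sample above;
// a threshold of 100 is an arbitrary choice for illustration.
const samples = ["54|1010", "33|1048", "137|1060", "0|0"];
for (const s of samples) {
  if (queueDepth(s) > 100) console.log("queue backlog: " + s);
}
```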
On Sat, Sep 10, 2011 at 7:38 AM, Eliot Horowitz (JIRA) <jira@mongodb.org> wrote: