-
Type: Bug
-
Resolution: Incomplete
-
Priority: Critical - P2
-
None
-
Affects Version/s: 2.1.1
-
Component/s: Index Maintenance, Sharding
-
None
-
Environment: Linux 2.6.32
mongo version: v2.1.1-pre-, pdfile version 4.5, git version: a2d6f752d56aa446220b9f14c8ad3865c2fb5db8
-
ALL
I have two shards.
First I set up sharding: I ran "addShard()", then "enableSharding()", then "shardCollection()".
I did not wait until all chunks were balanced across the cluster.
I then ran "removeshard" and, without waiting for all chunks to migrate to the primary shard, created an index on the sharded collection.
It then got stuck looping in state 2, at 88%.
The chunk migration to the primary shard also got stuck in a loop (at step 3), with 13 chunks left (about 10 had already migrated).
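The sequence above can be sketched as an ordered list of admin command documents. This is only an illustration of the order of operations, not the exact commands that were run: the host, shard name, database, collection and shard key are hypothetical placeholders, and the helper name buildReproCommands is made up for this sketch.

```javascript
// Build the admin commands in the order described in this report.
// All values passed in via `opts` are hypothetical placeholders.
function buildReproCommands(opts) {
  var ns = opts.db + "." + opts.coll;
  return [
    { addShard: opts.shardHost },            // add the second shard
    { enableSharding: opts.db },             // enable sharding on the database
    { shardCollection: ns, key: opts.key },  // shard the collection
    { removeShard: opts.shardName }          // start draining before balancing finishes
  ];
}

var cmds = buildReproCommands({
  shardHost: "shard2.example.net:27018",
  shardName: "shard0001",
  db: "testdb",
  coll: "events",
  key: { userId: 1 }
});

// In a mongo shell connected to mongos, each document could be sent with
//   db.getSiblingDB("admin").runCommand(cmd)
// and the index build started right afterwards with
//   db.getSiblingDB("testdb").events.ensureIndex({ someField: 1 })
```

The point of the sketch is the timing: removeShard and the index build are both issued while chunk migrations are still in progress.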
Collection size: ~40 million documents, avgObjSize: 680
Running a query from a new connection printed:
mongos> db.currentOp().inprog
Fri May 4 20:30:34 uncaught exception: error { "$err" : "socket exception", "code" : 11002 }
After that, mongod was killed.
I started it again.
Now it stops indexing at 55%. db.currentOp() shows no such operation, but I can see it in the log, which writes "TIME [migrateThread] 23349300/42063688 55%" about every 15 seconds.
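To check whether the counter in those log lines is really stuck rather than just slow, the progress lines can be parsed and compared. The following is a minimal sketch that assumes every progress line has the same shape as the single line quoted above; parseProgress and isStalled are hypothetical helper names, not part of MongoDB.

```javascript
// Parse a "[migrateThread] <done>/<total> <pct>%" progress line.
// Returns null for lines that do not match this assumed format.
function parseProgress(line) {
  var m = /\[migrateThread\]\s+(\d+)\/(\d+)\s+(\d+)%/.exec(line);
  if (!m) return null;
  return { done: Number(m[1]), total: Number(m[2]), percent: Number(m[3]) };
}

// True when the last `window` progress samples all report the same
// `done` count, i.e. the counter has not advanced between log writes.
function isStalled(lines, window) {
  var samples = lines.map(parseProgress).filter(function (p) {
    return p !== null;
  });
  if (samples.length < window) return false;
  var tail = samples.slice(-window);
  return tail.every(function (p) {
    return p.done === tail[0].done;
  });
}
```

With log writes about 15 seconds apart, a window of 4 identical samples would correspond to roughly a minute without progress.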
I am now also running "iostat -x 2":
$ iostat -x 2
Linux 2.6.32-5-xen-amd64 (ip-10-252-43-199) 05/04/2012 _x86_64_ (1 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
6.22 0.00 3.37 31.69 8.57 50.15
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.17 2.71 0.12 63.39 3.00 46.95 0.03 12.13 4.59 182.09 1.48 0.42
xvdb 0.16 627.80 380.62 78.33 15775.49 2846.00 81.15 17.80 38.79 1.89 218.07 0.54 25.00
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 1.36 1.36 0.00 1.36 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.28 0.00 6.76 90.99 1.97 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 11.27 0.00 267.04 0.00 47.40 0.03 2.60 2.60 0.00 2.60 2.93
xvdb 0.00 0.00 2568.17 0.28 105478.31 1.13 82.13 3.09 1.21 1.21 0.00 0.21 53.18
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.57 0.00 8.88 86.82 3.72 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 12.03 0.00 317.48 0.00 52.76 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 0.00 2784.24 0.00 117375.36 0.00 84.31 3.27 1.17 1.17 0.00 0.19 53.07
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.29 0.00 8.86 86.86 4.00 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 6.00 0.00 168.00 0.00 56.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 0.00 2752.00 0.00 116163.43 0.00 84.42 3.36 1.22 1.22 0.00 0.20 54.06
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.28 0.00 7.00 92.44 0.28 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 40.90 0.00 1182.07 0.00 57.81 0.05 1.12 1.12 0.00 0.27 1.12
xvdb 0.00 0.00 2924.09 0.00 123207.84 0.00 84.27 3.25 1.11 1.11 0.00 0.19 54.23
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.84 0.00 8.38 90.22 0.56 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 6.98 0.00 132.96 0.00 38.08 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 0.00 2916.76 0.00 122908.38 0.00 84.28 3.05 1.05 1.05 0.00 0.18 52.18
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.56 0.00 8.47 88.98 1.98 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 0.00 2829.10 0.00 119282.49 0.00 84.33 3.16 1.12 1.12 0.00 0.19 53.56
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.57 0.00 7.12 89.46 2.85 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 4.56 0.00 104.84 0.00 46.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 0.00 2859.54 0.00 120461.54 0.00 84.25 3.18 1.11 1.11 0.00 0.19 53.56
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.28 0.00 8.40 91.04 0.28 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 3.64 0.00 58.26 0.00 32.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 0.00 2909.52 0.00 122330.53 0.00 84.09 3.00 1.03 1.03 0.00 0.18 51.99
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.57 0.00 6.82 90.62 1.99 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 6.25 0.00 154.55 0.00 49.45 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 0.00 2886.65 0.00 121575.00 0.00 84.23 3.30 1.14 1.14 0.00 0.19 54.32
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.28 0.00 10.06 89.11 0.56 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 3.35 0.00 56.98 0.00 34.00 0.00 1.00 1.00 0.00 0.67 0.22
xvdb 0.00 0.00 2862.57 0.00 120623.46 0.00 84.28 3.08 1.08 1.08 0.00 0.18 52.51
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.56 0.00 4.78 90.45 4.21 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 60.11 0.28 1544.94 2.25 51.24 0.01 0.15 0.15 0.00 0.06 0.34
xvdb 0.00 0.00 1843.82 0.00 77457.30 0.00 84.02 2.80 1.52 1.52 0.00 0.24 45.17
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.26 0.00 1.29 96.64 1.81 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 140.05 0.00 3919.38 0.00 55.97 0.01 0.07 0.07 0.00 0.03 0.41
xvdb 0.00 1.55 154.78 2.58 5584.50 17.57 71.20 1.80 11.45 11.50 8.80 2.19 34.42
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 1.04 96.11 2.85 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 54.92 0.00 1367.88 0.00 49.81 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 0.00 202.07 0.00 6851.81 0.00 67.82 1.66 8.12 8.12 0.00 2.55 51.61
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.77 98.98 0.26 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 0.00 386.45 0.00 14086.96 0.00 72.91 1.89 4.93 4.93 0.00 1.32 50.95
xvdap3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00