Type: Bug
Resolution: Done
Priority: Major - P3
Affects Version/s: 1.6.4
Component/s: Sharding
Environment: CentOS release 5.4 (Final), mongodb-linux-x86_64-1.6.4.tgz
Operating System: Linux
Hi there,
As far as I know, the currently suggested way to restore a broken configsvr (or to build a new configsvr from one of the two healthy survivors) is to copy the entire data directory of a survivor and use that copy as the dbpath of the new configsvr.
In my test, this does not go through.
The steps were roughly as follows:
Destroy a configsvr:
a) kill one configsvr (based at csv59900)
b) move its dbdir somewhere else (mv csv59900 csv59909)
Now only two configsvrs are alive (csv59901 and csv59902). Assuming that csv59900 is unreachable, start restoring a replacement configsvr at port 59903:
c) cp -r csv59902 csv59903
d) rm -rf csv59903/mongod.lock
e) start configsvr 59903
f) start mongos at port 58800 with configsvr = localhost:59901,localhost:59902,localhost:59903
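For reference, the steps above consolidated into shell commands. The exact mongod/mongos invocations are not spelled out in the steps, so the flags shown in e) and f) (--configsvr, --dbpath, --port, --configdb) are my assumption of how the processes were started; the ports and directory names match the test setup.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# a)-b) stop the configsvr whose dbdir is csv59900 and set its data aside
kill <pid of configsvr on port 59900>
mv csv59900 csv59909

# c)-d) seed the replacement from a surviving configsvr's data
cp -r csv59902 csv59903
rm -rf csv59903/mongod.lock

# e) start the replacement configsvr (assumed invocation)
mongod --configsvr --dbpath csv59903 --port 59903

# f) start mongos against the two survivors plus the replacement (assumed invocation)
mongos --port 58800 --configdb localhost:59901,localhost:59902,localhost:59903
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<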
To check whether the data is available:
g) mongo localhost:58800
h) show dbs (the result is correct)
i) show collections (the result is correct)
j) count one collection (error)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> db.t0.count()
Tue Dec 21 12:04:35 uncaught exception: count failed: {
"assertion" : "setShardVersion failed!
",
"assertionCode" : 10429,
"errmsg" : "db assertion failure",
"ok" : 0
}
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
None of the collections are available.
Logs from mongos:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Tue Dec 21 12:04:35 [conn1] setShardVersion failed:
Tue Dec 21 12:04:35 [conn1] Assertion: 10429:setShardVersion failed!
{ "errmsg" : "specified a different configdb!", "ok" : 0 }0x50997e 0x60bdd4 0x60b89c 0x60b89c 0x60b89c 0x554788 0x5559ec 0x551e39 0x552216 0x63e312 0x559297 0x61629c 0x64407a 0x6519b9 0x55d3b2 0x674ca0 0x3a6d8064a7 0x3a6d0d3c2d
/data0/mongo/mongodb/bin/mongos(_ZN5mongo11msgassertedEiPKc+0x1de) [0x50997e]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo17checkShardVersionERNS_12DBClientBaseERKSsbi+0xa34) [0x60bdd4]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo17checkShardVersionERNS_12DBClientBaseERKSsbi+0x4fc) [0x60b89c]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo17checkShardVersionERNS_12DBClientBaseERKSsbi+0x4fc) [0x60b89c]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo17checkShardVersionERNS_12DBClientBaseERKSsbi+0x4fc) [0x60b89c]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo17ClientConnections13checkVersionsERKSs+0x288) [0x554788]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo17ClientConnections3getERKSsS2_+0x39c) [0x5559ec]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo15ShardConnection5_initEv+0x59) [0x551e39]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo15ShardConnectionC1ERKNS_5ShardERKSs+0x86) [0x552216]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo15dbgrid_pub_cmds8CountCmd3runERKSsRNS_7BSONObjERSsRNS_14BSONObjBuilderEb+0xc62) [0x63e312]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo7Command20runAgainstRegisteredEPKcRNS_7BSONObjERNS_14BSONObjBuilderE+0x5d7) [0x559297]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo14SingleStrategy7queryOpERNS_7RequestE+0x26c) [0x61629c]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo7Request7processEi+0x26a) [0x64407a]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortE+0x149) [0x6519b9]
/data0/mongo/mongodb/bin/mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x252) [0x55d3b2]
/data0/mongo/mongodb/bin/mongos(thread_proxy+0x80) [0x674ca0]
/lib64/libpthread.so.0 [0x3a6d8064a7]
/lib64/libc.so.6(clone+0x6d) [0x3a6d0d3c2d]
Tue Dec 21 12:12:02 [Balancer] dist_lock forcefully taking over from:
elapsed minutes: 11
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Any ideas?