Problem:
During a mongodump, the mongos process it was connected to died. The mongos log showed the following:
Thu May 19 02:01:07 [LockPinger] dist_lock pinged successfully for: us0101aej024.tangome.gbl:1305761193:1804289383
Thu May 19 02:04:21 [mongosMain] connection accepted from 127.0.0.1:39876 #8
Thu May 19 02:04:33 [conn8] got not master for: us0101amd206
Thu May 19 02:04:39 [conn8] end connection 127.0.0.1:39876
Received signal 6
Backtrace: 0x52e235 0x301ac302d0 0x301ac30265 0x301ac31d10 0x301ac296e6 0x5517c3 0x552208 0x54a10c 0x53fcca 0x577eae 0x5789dc 0x69dae1 0x69ec38 0x301b40673d 0x301acd3f6d
/local/mongo/bin/mongos(_ZN5mongo17printStackAndExitEi+0x75)[0x52e235]
/lib64/libc.so.6[0x301ac302d0]
/lib64/libc.so.6(gsignal+0x35)[0x301ac30265]
/lib64/libc.so.6(abort+0x110)[0x301ac31d10]
/lib64/libc.so.6(__assert_fail+0xf6)[0x301ac296e6]
/local/mongo/bin/mongos(_ZN5mongo18DBClientReplicaSet11checkMasterEv+0x4b3)[0x5517c3]
/local/mongo/bin/mongos(_ZN5mongo18DBClientReplicaSet7findOneERKSsRKNS_5QueryEPKNS_7BSONObjEi+0x128)[0x552208]
/local/mongo/bin/mongos(_ZN5mongo20DBClientWithCommands10runCommandERKSsRKNS_7BSONObjERS3_i+0x8c)[0x54a10c]
/local/mongo/bin/mongos(_ZN5mongo20DBClientWithCommands13simpleCommandERKSsPNS_7BSONObjES2_+0x12a)[0x53fcca]
/local/mongo/bin/mongos(_ZN5mongo17ClientConnections7releaseERKSsPNS_12DBClientBaseE+0x10e)[0x577eae]
/local/mongo/bin/mongos(_ZN5boost19thread_specific_ptrIN5mongo17ClientConnectionsEE11delete_dataclEPv+0xac)[0x5789dc]
/local/mongo/bin/mongos(tls_destructor+0xb1)[0x69dae1]
/local/mongo/bin/mongos(thread_proxy+0x88)[0x69ec38]
/lib64/libpthread.so.0[0x301b40673d]
/lib64/libc.so.6(clone+0x6d)[0x301acd3f6d]
===
Received signal 11
Backtrace: 0x52e235 0x301ac302d0 0x53fc8d 0x577eae 0x5789dc 0x69f3c1 0x573b22 0x301ac333a5 0x52e29b 0x301ac302d0 0x301ac30265 0x301ac31d10 0x301ac296e6 0x5517c3 0x552208 0x54a10c 0x53fcca 0x577eae 0x5789dc 0x69dae1
/local/mongo/bin/mongos(_ZN5mongo17printStackAndExitEi+0x75)[0x52e235]
/lib64/libc.so.6[0x301ac302d0]
/local/mongo/bin/mongos(_ZN5mongo20DBClientWithCommands13simpleCommandERKSsPNS_7BSONObjES2_+0xed)[0x53fc8d]
/local/mongo/bin/mongos(_ZN5mongo17ClientConnections7releaseERKSsPNS_12DBClientBaseE+0x10e)[0x577eae]
/local/mongo/bin/mongos(_ZN5boost19thread_specific_ptrIN5mongo17ClientConnectionsEE11delete_dataclEPv+0xac)[0x5789dc]
/local/mongo/bin/mongos(_ZN5boost6detail12set_tss_dataEPKvNS_10shared_ptrINS0_20tss_cleanup_functionEEEPvb+0x151)[0x69f3c1]
/local/mongo/bin/mongos[0x573b22]
/lib64/libc.so.6(exit+0xe5)[0x301ac333a5]
/local/mongo/bin/mongos[0x52e29b]
/lib64/libc.so.6[0x301ac302d0]
/lib64/libc.so.6(gsignal+0x35)[0x301ac30265]
/lib64/libc.so.6(abort+0x110)[0x301ac31d10]
/lib64/libc.so.6(__assert_fail+0xf6)[0x301ac296e6]
/local/mongo/bin/mongos(_ZN5mongo18DBClientReplicaSet11checkMasterEv+0x4b3)[0x5517c3]
/local/mongo/bin/mongos(_ZN5mongo18DBClientReplicaSet7findOneERKSsRKNS_5QueryEPKNS_7BSONObjEi+0x128)[0x552208]
/local/mongo/bin/mongos(_ZN5mongo20DBClientWithCommands10runCommandERKSsRKNS_7BSONObjERS3_i+0x8c)[0x54a10c]
/local/mongo/bin/mongos(_ZN5mongo20DBClientWithCommands13simpleCommandERKSsPNS_7BSONObjES2_+0x12a)[0x53fcca]
/local/mongo/bin/mongos(_ZN5mongo17ClientConnections7releaseERKSsPNS_12DBClientBaseE+0x10e)[0x577eae]
/local/mongo/bin/mongos(_ZN5boost19thread_specific_ptrIN5mongo17ClientConnectionsEE11delete_dataclEPv+0xac)[0x5789dc]
/local/mongo/bin/mongos(tls_destructor+0xb1)[0x69dae1]
===
Thu May 19 02:04:39 CursorCache at shutdown - sharded: 1 passthrough: 0
The rs.status() output showed that one of the replica sets was in a bad state: two of its three members were RECOVERING with "error RS102 too stale to catch up".
xxxxxSet6:PRIMARY> rs.status();
{
"set" : "xxxxxSet6",
"date" : ISODate("2011-05-19T03:45:15Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "us0101amd106.xxxxx.gbl:27017",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 535081,
"optime" :
,
"optimeDate" : ISODate("2011-05-15T21:27:11Z"),
"lastHeartbeat" : ISODate("2011-05-19T03:45:14Z"),
"errmsg" : "error RS102 too stale to catch up"
},
{
"_id" : 1,
"name" : "us0101amd206",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"optime" :
,
"optimeDate" : ISODate("2011-05-19T03:45:15Z"),
"self" : true
},
{
"_id" : 2,
"name" : "us0101amd306",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 535372,
"optime" :
,
"optimeDate" : ISODate("2011-05-16T06:55:27Z"),
"lastHeartbeat" : ISODate("2011-05-19T03:45:14Z"),
"errmsg" : "error RS102 too stale to catch up"
}
],
"ok" : 1
}
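To make the state of the set easier to read at a glance, the following is a minimal Python sketch that flags members that are neither PRIMARY nor SECONDARY and shows how far each lags behind the primary's optimeDate. The names `unhealthy_members` and `HEALTHY_STATES` are hypothetical, and the `status` dict is hand-copied from the rs.status() output above (only the fields needed here); it is not fetched from a live server.

```python
from datetime import datetime

# Replica set member states considered healthy: 1 = PRIMARY, 2 = SECONDARY.
HEALTHY_STATES = {1, 2}


def unhealthy_members(status):
    """Return (name, stateStr, lag, errmsg) for each member not PRIMARY/SECONDARY.

    `lag` is the primary's optimeDate minus the member's optimeDate,
    or None when the status document lists no primary.
    """
    primaries = [m for m in status["members"] if m["state"] == 1]
    primary_optime = primaries[0]["optimeDate"] if primaries else None

    report = []
    for member in status["members"]:
        if member["state"] in HEALTHY_STATES:
            continue
        lag = None
        if primary_optime is not None:
            lag = primary_optime - member["optimeDate"]
        report.append(
            (member["name"], member["stateStr"], lag, member.get("errmsg"))
        )
    return report


# Values copied by hand from the rs.status() output in this report.
status = {
    "set": "xxxxxSet6",
    "members": [
        {
            "name": "us0101amd106.xxxxx.gbl:27017",
            "state": 3,
            "stateStr": "RECOVERING",
            "optimeDate": datetime(2011, 5, 15, 21, 27, 11),
            "errmsg": "error RS102 too stale to catch up",
        },
        {
            "name": "us0101amd206",
            "state": 1,
            "stateStr": "PRIMARY",
            "optimeDate": datetime(2011, 5, 19, 3, 45, 15),
        },
        {
            "name": "us0101amd306",
            "state": 3,
            "stateStr": "RECOVERING",
            "optimeDate": datetime(2011, 5, 16, 6, 55, 27),
            "errmsg": "error RS102 too stale to catch up",
        },
    ],
}

for name, state, lag, errmsg in unhealthy_members(status):
    print(f"{name}: {state}, behind primary by {lag} ({errmsg})")
```

With the values above, both RECOVERING members are roughly three days behind the primary, which is consistent with the RS102 error. Against a live deployment, the same dict shape could be taken from the `replSetGetStatus` command that backs rs.status(), assuming the driver returns optimeDate as a datetime.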