- Type: Bug
- Resolution: Cannot Reproduce
- Priority: Major - P3
- Affects Version/s: 2.6.4, 2.6.10
- Component/s: Sharding
- ALL
We have one cluster with 5 shards, each a replica set of 3 physical members. The 3 config servers and 3 routers (mongos) run on 3 VMs: sx350, sx351 and sx352. On 3 other VMs (offerstore-en-router-01, offerstore-en-router-02 and offerstore-en-router-03) we have installed 3 additional routers (mongos).
One VM (sx352) went down at 7 o'clock, taking its config server and router down with it.
The problem is that no connections through the mongos instances on offerstore-en-router-01, offerstore-en-router-02 and offerstore-en-router-03 were possible until sx352 came back, roughly 20 minutes after it had crashed!
While sx352 was down, the mongo shell took so long to connect (using auth) that I closed it before it returned. Without --user and --password, the mongo shell connected quickly, but as soon as I entered db.auth("admin", "XXX") it blocked, so I closed it after a few seconds.
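For anyone trying to reproduce this, a quick way to confirm which config servers are reachable from a router VM while the shell hangs is a plain TCP connect check. The following is only a minimal sketch: the host names are the ones from this report, but port 27019 is an assumption (the conventional config server port; the actual port is not stated here), and `unreachable_hosts` is a hypothetical helper, not part of any MongoDB tooling.

```python
import socket

# Host names taken from the report; port 27019 is an ASSUMPTION
# (the conventional config server port), adjust to the real deployment.
CONFIG_SERVERS = [("sx350", 27019), ("sx351", 27019), ("sx352", 27019)]


def unreachable_hosts(hosts, timeout=2.0):
    """Return the (host, port) pairs that cannot be reached over TCP.

    A pair is reported when name resolution fails, the connection is
    refused, or the connect attempt exceeds `timeout` seconds.
    """
    down = []
    for host, port in hosts:
        try:
            # Open and immediately close a TCP connection.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            # Covers refused connections, timeouts and DNS failures.
            down.append((host, port))
    return down
```

Running `unreachable_hosts(CONFIG_SERVERS)` on offerstore-en-router-01 during the outage would be expected to list only sx352. This only shows network reachability; it does not explain why authentication through mongos blocks.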
Do you know why one crashed config server can block access to the cluster through mongos instances running on different VMs, and how we can avoid this?
Thanks!