Type: Bug
Resolution: Cannot Reproduce
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 2.6.4, 2.6.10
Component/s: Sharding
Labels: None
Operating System: ALL
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

We have one cluster consisting of 5 shards, each a replica set with 3 physical members. 3 config servers and 3 routers (mongos) run on 3 different VMs, called sx350, sx351 and sx352. We also have 3 other VMs, called offerstore-en-router-01, offerstore-en-router-02 and offerstore-en-router-03, on which we have installed 3 more routers (mongos).
One VM (sx352) went down at 7 o'clock, taking its config server and router down with it.

The problem is that no connections through the mongos instances on offerstore-en-router-01, offerstore-en-router-02 and offerstore-en-router-03 were possible until sx352 came back up, about 20 minutes after it had gone down!

While sx352 was down, the mongo shell took so long to connect (using auth) that I closed it before it came back. Without --user and --password, the mongo shell connected quickly, but as soon as I entered db.auth("admin", "XXX") it blocked, so I closed it after a few seconds.
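
For anyone trying to reproduce this, the following is a minimal sketch of a reachability check that shows which config server host is not accepting TCP connections while the shell hangs. The hostnames are the ones from this report; the port is an assumption (27019 is the conventional config server port, but the report does not state which port is in use), and the function names are hypothetical helpers, not part of any MongoDB tooling.

```python
import socket

# Hostnames are taken from the report. The port is an ASSUMPTION:
# 27019 is the conventional config server port, the report does not state it.
CONFIG_SERVERS = ["sx350", "sx351", "sx352"]
ASSUMED_PORT = 27019


def is_reachable(host, port, timeout=2.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
    conn.close()
    return True


def unreachable_hosts(hosts, port, timeout=2.0, connect=socket.create_connection):
    """Return the subset of hosts that could not be reached."""
    return [h for h in hosts if not is_reachable(h, port, timeout, connect)]


if __name__ == "__main__":
    down = unreachable_hosts(CONFIG_SERVERS, ASSUMED_PORT)
    print("unreachable config servers:", down or "none")
```

This only tests TCP reachability, not authentication, so it can help separate "the host is down" from "the host is up but auth through mongos blocks".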

Do you know why one crashed config server is able to block access to the cluster through mongos instances running on different VMs, and how one can avoid this issue?
Thanks!

Assignee: Ramon Fernandez Marina
Reporter: Kay Agahd
Participants: Alexander Bulaev, Kay Agahd, Ramon Fernandez Marina
Votes: 0
Watchers: 7

Created: Oct 06 2015 01:22:08 PM UTC
Updated: Feb 24 2016 01:50:23 PM UTC
Resolved: Feb 10 2016 07:58:53 PM UTC
