[SERVER-5322] better messaging when cannot connect to mongos when shard is down Created: 15/Mar/12  Updated: 20/Jul/15  Resolved: 20/Jul/15

Status: Closed
Project: Core Server
Component/s: Security, Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Kristina Chodorow (Inactive) Assignee: Unassigned
Resolution: Done Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-5406 mongos does not allow new connections... Closed
Operating System: ALL
Participants:

 Description   

Resolved the current issue - leaving this open for better reporting when mongos auth fails against a sharded database.

=====

See https://groups.google.com/forum/?fromgroups#!topic/mongodb-user/NREVf6tuvbA

$ mongo localhost:27123/db -uusername -ppassword
MongoDB shell version: 2.0.3
connecting to: localhost:27123/db
Wed Mar 14 11:01:47 uncaught exception: error { "$err" : "socket exception", "code" : 11002 }
exception: login failed
$

Seems related to SERVER-3763 and SERVER-4643, but this user is on 2.0.3 and it still isn't working.



 Comments   
Comment by Andy Schwerin [ 20/Jul/15 ]

As of 2.6, auth data is stored in the config database, not in the shards, so this error should have gone away.

Comment by Greg Studer [ 13/Jun/12 ]

Changed priority to minor; this has become a logging issue.

Comment by Vasyl [ 18/Apr/12 ]

Thanks Greg! Appreciate it.

Vasyl

Comment by Greg Studer [ 18/Apr/12 ]

Agree - updated.

Comment by Vasyl [ 18/Apr/12 ]

You're right, Greg.
This makes sense to some extent.

I think the auth=off / no --keyFile case should then be mentioned in the failover section
(3. Failure of all mongod servers comprising a shard):
http://www.mongodb.org/display/DOCS/Sharding+and+Failover

Vasyl

Comment by Greg Studer [ 18/Apr/12 ]

I'm not yet convinced this is actually the best way forward - for example some authentication data sits on the config servers, which are always up by definition if the cluster is up. Also, there's always the ability to add more replica set servers to the primary shard, to get whatever level of redundancy you think is needed. Spreading the auth data around causes a lot of problems, in terms of migrations, shards added / removed, stale auth data, etc - if increasing uptime is the goal, there are a lot of options.

Feel free to open an improvement ticket though, to tell us how you think things should behave and we could have more discussion there.

Comment by Vasyl [ 18/Apr/12 ]

Hey Greg!

Yep, I totally agree - the auth feature needs to be improved.

We do use replica sets, so we are safe to some extent - but not if the primary shard goes down.
It would be extremely nice to let end users whose data resides on the other shard(s) still access it in that case.

Could this be tracked in Jira, please?
I mean the idea of sharing system users among all shards, because of the single point of failure.

Thanks
Vasyl

Comment by Greg Studer [ 10/Apr/12 ]

> If #1 shard goes down - ##2-100 shards will become unavailable for the end clients -

This is somewhat true - any connections that are currently open will still be authenticated, and ideally the drivers support connection pooling so new connections aren't required for each request. Also, in any production setup we highly recommend replica set shards so that there's never a need to have a shard completely down. With the patch provided in SERVER-5406, a secondary should always be available for reading the auth info.
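The pooling point can be illustrated with a minimal sketch (purely illustrative, not the actual driver internals; `connect_and_auth` is a hypothetical stand-in for the driver's connect + authenticate step): a pool authenticates once per physical connection and then reuses it, so an outage of the shard holding the auth data only hurts brand-new connections.

```python
class ConnectionPool:
    """Minimal sketch of driver-style connection pooling.

    connect_and_auth is a hypothetical stand-in for the driver's real
    connect + authenticate step; it is only invoked when the pool has
    no idle connection to hand out.
    """

    def __init__(self, connect_and_auth):
        self._connect_and_auth = connect_and_auth
        self._idle = []

    def checkout(self):
        # Reuse an already-authenticated connection if one exists;
        # only a brand-new checkout needs the auth data to be reachable.
        if self._idle:
            return self._idle.pop()
        return self._connect_and_auth()

    def checkin(self, conn):
        self._idle.append(conn)


# Simulate one successful auth, after which the auth source goes down.
auth_calls = []

def connect_and_auth():
    if auth_calls:  # auth source "down" after the first connect
        raise RuntimeError("cannot reach auth data on primary shard")
    auth_calls.append(1)
    return object()

pool = ConnectionPool(connect_and_auth)
conn = pool.checkout()   # authenticates once, while the shard is up
pool.checkin(conn)
conn2 = pool.checkout()  # reused connection: no new auth required
assert conn2 is conn
assert len(auth_calls) == 1
```

With per-request connections instead of a pool, the second checkout would have raised, which is exactly the failure mode users without pooling see.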

Agree in general though, authentication is a fairly new feature and we want to make sure we eliminate bottlenecks wherever possible.

Comment by Vasyl [ 10/Apr/12 ]

Hello Greg!

> I believe this is actually works-as-designed if the full shard is down, we unfortunately can't authorize access to the db without the security info stored on the primary shard

I had the same thoughts about this issue.

> Users are stored on the primary shard so authentication can be done there
http://www.mongodb.org/display/DOCS/Security+and+Authentication#SecurityandAuthentication-ReplicaSetandShardingAuthentication

I think it would be a good idea to share the system users among all shards, because the primary shard is a single point of failure.
Example: 100 shards (#1 is the primary shard, with the system accounts stored on it) - if shard #1 goes down, shards #2-100 become unavailable to the end clients.

thanks
vasyl

Comment by Greg Studer [ 05/Apr/12 ]

Leaving open for fix of better messaging.

Comment by Greg Studer [ 05/Apr/12 ]

After more testing/reproducing last night as well, this is also working as designed when starting mongos without a replica set primary on the shard holding the authentication data. When first starting up, mongos needs to be able to connect to a replica set primary in order to ensure it isn't seeing a stale minority of nodes that have been partitioned off the replica set. SERVER-5406 has a patch for the cases where an existing mongos sees the primary go down, or a new mongos can only reach secondaries.
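The startup rule described above can be sketched as follows (purely illustrative logic, not the mongos source): seeing only secondaries could mean we are looking at a stale minority partition, so a fresh mongos insists on a reachable primary, while an already-initialized one may keep using secondaries (the behavior SERVER-5406 patches).

```python
def safe_to_use_shard(reachable_states, already_initialized):
    """Illustrative sketch of the startup check described above.

    reachable_states: set of member states ("PRIMARY" / "SECONDARY")
    that this mongos can currently reach in the shard's replica set.

    A fresh mongos refuses to initialize from secondaries alone, since
    they could be a stale minority partitioned off the set; a running
    mongos that has already seen a primary may keep reading from
    secondaries.
    """
    if "PRIMARY" in reachable_states:
        return True
    return already_initialized and "SECONDARY" in reachable_states


# Fresh mongos, primary down: refuse (could be a stale minority).
assert not safe_to_use_shard({"SECONDARY"}, already_initialized=False)
# Existing mongos, primary down: secondaries still usable for auth reads.
assert safe_to_use_shard({"SECONDARY"}, already_initialized=True)
# Primary visible: always fine.
assert safe_to_use_shard({"PRIMARY", "SECONDARY"}, already_initialized=False)
```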

Comment by Greg Studer [ 05/Apr/12 ]

I believe this is actually working as designed: if the full shard is down, we unfortunately can't authorize access to the db without the security info, which is stored on the primary shard.

Generated at Thu Feb 08 03:08:32 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.