[SERVER-1594] mongos should work with replica-set (without sharding) Created: 09/Aug/10  Updated: 16/Jan/24

Status: Blocked
Project: Core Server
Component/s: Replication, Sharding
Affects Version/s: 1.6.0
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Scott Hernandez (Inactive) Assignee: Backlog - Catalog and Routing
Resolution: Unresolved Votes: 146
Labels: oldshardingemea
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to TOOLS-487 MongoDB proxy, like MongoS but withou... Closed
Assigned Teams:
Catalog and Routing
Participants:

 Description   

Change the mongos router to work with a non-sharding replica set.



 Comments   
Comment by Alexander Komyagin [ 27/Sep/19 ]

I built a prototype for this functionality (not supported): https://github.com/alex-thc/monguac/tree/new

Comment by jagadeep suggala [ 18/Mar/19 ]

+1

Comment by ttnn [ 27/Mar/18 ]

+1

Comment by Donald Scott [ 13/Jan/17 ]

Fundamentally there needs to be an option to load balance across a replica set WITHOUT sharding. Many people have workloads which are too small to warrant sharding, yet require replica sets for redundancy. These people then have maintenance and reporting scripts which could tolerate stale data, but we have no mechanism to intelligently route those queries to the most available server.

Comment by Dominic Baines [ 02/Jun/16 ]

Nothing in ages. Is this still being looked at? This would be an extremely useful feature.

Comment by Chuck Desmarais [ 30/Jun/15 ]

+1. We got around it by creating a single-shard cluster, but we hadn't really wanted to do that. This feature would have saved us an annoying trip into driver edge cases.

Comment by Thijs Cadier [ 22/Oct/14 ]

+1, this would be extremely useful to us.

Comment by David Henderson [ 07/Oct/14 ]

I've cross-posted this response to a blog post here: http://emptysqua.re/blog/server-discovery-and-monitoring-spec/

I think this view (from the previous comment, that there is little value to more widespread use of mongos) is a little short-sighted. As far as I can see, there are other benefits of having this logic in the mongos (as well as removing that complexity from driver developers):

For ease, I'm going to refer throughout to a mongo instance (i.e. a single mongod / a replicaset / a sharded cluster) as a "target".

  • If you have multiple processes on one app server (possibly in different languages too), each connecting to a number of targets, having a local mongos that deals with connection routing / pooling should cut down the connection count / non-mapped VM on the mongo servers. Where you have multiple targets, you could have multiple local mongos instances on different ports. Each app process would connect on the right port for the relevant target it wants to reach. The mongos instances could then manage the routing / pooling in a more efficient manner.
  • If you have the mongos as a router / pooling manager, then the app layer does not need to have any knowledge of what it is connecting to. There should be no good reason for it to care whether the server it is connecting to is, for example, a replica set or a standalone mongod. In Python, you would be connecting with MongoReplicaSetClient vs MongoClient, requiring app code changes if the target changes.
  • Another benefit of using the mongos in this way is to take the fine details out of the app layer and move them into the ops space. Currently, if you're using the mongos in a sharded environment, you can simply connect to mongos and have no knowledge of where the cluster / shards are, as this is all dealt with between the mongos / config servers.
  • It would ease the transition from mongod -> RS -> Cluster (especially in a mixed environment). If you decide that you need to change one replicaset to a sharded cluster, you currently would have to:
    1) Change the code to use the correct connection mechanism (for that target only)
    2) Add a mongos to each app box for that target
    3) Add 3 config servers to manage that target.
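Those three steps can be sketched with commands like the following (a rough sketch only; hostnames, ports, and the replica-set names `cfgRS`/`rs0` are placeholders, and the `--configdb` replica-set syntax shown comes from MongoDB versions later than this comment):

```shell
# 1) Change the app's connection target for this cluster only
#    (was: mongodb://rs0a:27017,rs0b:27017/?replicaSet=rs0)
#    (now: mongodb://localhost:27017)

# 2) Start a mongos on each app box, pointed at the config servers
mongos --configdb cfgRS/cfg1:27019,cfg2:27019,cfg3:27019 --port 27017

# 3) Stand up the three config servers (run once per config host)
mongod --configsvr --replSet cfgRS --port 27019 --dbpath /data/configdb
```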

If, on the other hand, you had the option to connect to anything via a mongos, as a standard layer, you would only have to:
1) Run up & configure your config servers
2) Change your mongos config so that it points at the config servers rather than at the replica-set seeds.

For some background, we're running with:

  • A number of clusters, each with a single shard (i.e. sharding effectively off). The shard is a replica set with a primary and a number of secondaries. Each "cluster" has the required 3 configsvrs for production usage (which is a pain in terms of management, especially since each new cluster means 3 more configsvrs to deploy in production; they quickly add up)
  • Each app server has a mongos for each "cluster", e.g. cluster 1 on 27017, cluster 2 on 27027, etc.
  • Each process makes connection(s) to the relevant mongos ports based on which clusters it needs to query at any point.
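On each app server, that layout looks roughly like this (a sketch only; the config server hostnames and replica-set names are placeholders, and the `--configdb` form shown is the later replica-set syntax):

```shell
# One mongos per "cluster", distinguished by listening port:
mongos --configdb cfgRS1/c1a:27019,c1b:27019,c1c:27019 --port 27017  # cluster 1
mongos --configdb cfgRS2/c2a:27019,c2b:27019,c2c:27019 --port 27027  # cluster 2
```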

In conclusion, here's my wishlist for an ideal mongos solution:

  • mongos could optionally take 3 config options: single mongod address / replicaset seeds / configserver details
  • a single mongos process could have multiple of the above configured, listening on different ports (saving the overhead of running multiple mongos processes)
  • the drivers could talk to the mongos in a consistent manner for all types behind it (as they currently do for sharded cluster). The mongos would deal with connection pooling / readPreference routing / HA / write concerns in a stable and consistent way.

Any thoughts would be appreciated.

Comment by Sebastian Riedel [ 07/Sep/14 ]

There is so much unnecessary work going into replica set support (especially read preferences) for all the different drivers, it boggles the mind. Such an important part of the infrastructure should really be owned by the core server.

Comment by Abhishek Kona [ 27/Feb/14 ]

This kind of critical infrastructure should ideally be owned by Mongo. Is this in the roadmap for 2.6?

Comment by Glenn Maynard [ 24/Jan/14 ]

Big +1. I wasted a lot of time today bringing up a server because of replica set driver weirdness. (Pymongo requires the replica set name in the URL for some reason and my Mongo host doesn't include it. Pymongo also has a whole separate top-level class just for replica sets.) Users and drivers shouldn't have to care whether they're connecting to a standalone server, a replica set or a cluster, and mongos could mask all this and eliminate whole categories of driver bugs and usage errors.

Another example is how easy it is to end up using a non-replica-set mongo URL when connecting to a replica set, and this will "seem to work" for a long time if it happens to be the primary, then suddenly fail later after a failover. This can be hidden within mongos and handled automatically.
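The failure mode described above comes down to a single URI option: whether a driver treats the target as a replica set often hinges on `replicaSet` being present in the connection string. A minimal stdlib sketch (hostnames are placeholders; the helper name is my own) showing how that option is carried:

```python
from urllib.parse import urlsplit, parse_qs

def replica_set_name(uri):
    """Return the replicaSet option from a MongoDB URI, or None.

    Drivers use this option to decide whether to treat the target as a
    replica set; without it, a URI pointing at the current primary can
    "seem to work" until a failover.
    """
    # MongoDB URIs carry options in the query string, e.g.
    # mongodb://h1,h2/?replicaSet=rs0
    query = urlsplit(uri).query
    return parse_qs(query).get("replicaSet", [None])[0]

print(replica_set_name("mongodb://h1:27017,h2:27017/?replicaSet=rs0"))  # rs0
print(replica_set_name("mongodb://h1:27017/"))  # None
```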

Comment by Thomas Parrott [ 15/Jan/14 ]

This would be a really useful feature, as we could let the mongos deal with maintaining persistent connections to the replica sets, meaning PHP/Apache clients don't have a delay whilst they connect to all the nodes.

Comment by Abhishek Kona [ 30/Dec/13 ]

I would like to use mongos as a connection proxy to reduce the number of connections made to mongod.

Comment by Stephan Bösebeck [ 28/May/13 ]

Need this feature, too... very much so.
We also have a lot of shell scripts in our environment, and the shell does not do failovers.

Comment by Philippe David [ 12/Mar/13 ]

A lot of effort has gone into coding the replica set connection/reconnection logic in several drivers.
I have been using the PHP driver in production for nearly 2 years now, and we have experienced significant trouble in degraded situations. There has been a lot of work on this driver, yet here we are in 2013 and things like the connection timeout have only been fixed very recently.

The client-side replica set code is indeed complex, and even for a language as popular as PHP I am not yet satisfied with the driver. To me it sounds very reasonable to have this complexity implemented only once, in a generic proxy, and not in every driver. If that means running a mongos daemon on localhost for every PHP client, fine. As long as it's more stable.

+1

Comment by Thibault Meyer [ 26/Jan/13 ]

So I'm looking for this kind of feature today. You have my vote too.

Comment by Marco Lovato [ 21/Nov/12 ]

We are just about to install MongoDB for our production site (replacing Oracle) and we did a lot of testing using mongos, but our production site will definitely use a replica set. Then we found this.
Again, is it still planned? It has been more than 2 years since the first post, and it's considered small (<1 day).

Best
Marco

Comment by S Porcina [ 29/Mar/12 ]

The feature gets my vote too.

Comment by Alex Simenduev [ 22/Jan/12 ]

Is this still planned? When can we expect this feature? I'm desperately waiting for it...

Comment by Markus Gattol [ 21/Nov/10 ]

Resolving this one would also enable the case where nginx-gridfs could read from secondaries as opposed to primaries only.

Generated at Thu Feb 08 02:57:28 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.