[CSHARP-536] DNS round-robin support Created: 23/Jul/12 Updated: 17/Feb/17 Resolved: 03/Sep/14 |
|
| Status: | Closed |
| Project: | C# Driver |
| Component/s: | None |
| Affects Version/s: | 1.5 |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Aristarkh Zagorodnikov | Assignee: | Unassigned |
| Resolution: | Won't Fix | Votes: | 2 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Description |
|
It looks like the driver now has the framework to support multiple connections to shards in place, so I would like to suggest allowing the shard router to be "seeded" with one DNS record that resolves to multiple addresses, yielding multiple mongos connections for both load balancing and failover. While I suspect (not tested yet) that specifying several different addresses might do the trick, having the driver use DNS's ability to map multiple addresses to one name would be very convenient for administration purposes. |
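For illustration, a minimal sketch of what the request amounts to, assuming a hypothetical hostname `router.example.com` that carries several A records: expand the name client-side and pass every resolved address to the driver as an ordinary seed list.

```csharp
// Sketch only: expand one DNS name into a multi-address seed list.
// "router.example.com" and port 27017 are hypothetical placeholders.
using System.Linq;
using System.Net;
using MongoDB.Driver;

class SeedFromDns
{
    static void Main()
    {
        // Returns every A record behind the name.
        var addresses = Dns.GetHostAddresses("router.example.com");

        var settings = new MongoClientSettings
        {
            Servers = addresses.Select(ip => new MongoServerAddress(ip.ToString(), 27017))
        };
        var client = new MongoClient(settings);
    }
}
```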
| Comments |
| Comment by Aristarkh Zagorodnikov [ 04/Sep/14 ] |
|
Robert, thank you for the link, I'll check it out. |
| Comment by Aristarkh Zagorodnikov [ 04/Sep/14 ] |
|
Craig, I'm sorry for not paying attention to the original comment, yes, you addressed that case. |
| Comment by Robert Stam [ 04/Sep/14 ] |
|
Aristarkh, there is another JIRA ticket specifically about supporting mongos'es behind a load balancer: https://jira.mongodb.org/browse/CSHARP-1007 (you can also comment or vote on that ticket). Currently, supporting mongos'es behind a load balancer is considered very low priority and is not likely to happen, even though there is no reason it couldn't be made to work. |
| Comment by Craig Wilson [ 04/Sep/14 ] |
|
No, I didn't miss it. I mentioned it in my original comment. The driver already handles failover between mongos instances, so that's not the concern. But the same problems still apply to mongos. First, we perform health checks against each server we are monitoring and use that information to decide how to perform certain operations (for instance, adding a user uses a command in 2.6 but an insert in 2.4). If we can't reliably know what version of the server we are talking to, there will be random and finicky errors. Second, the query protocol for MongoDB is stateful: it requires that we send OP_GETMOREs to the same server as the original request. Having round-robin DNS or a load balancer in front first requires connection affinity and then requires that the driver check out the connection for the duration of the query, however long that takes (see the sketch below). This certainly makes connection pooling less robust and limits scalability. It is also something we would rather leave to the user to decide, as they know their application's needs better than we do. In any case, if the need is failover, the driver natively provides that, which we believe is sufficient. |
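To make the statefulness concrete, here is a hypothetical sketch of the connection pinning Craig describes. Every name below (`IConnection`, `IConnectionPool`, `SendQuery`, `SendGetMore`, `Reply`) is illustrative, not a real driver API:

```csharp
using System.Collections.Generic;

// Hypothetical types, purely to illustrate the stateful query protocol.
interface IConnection
{
    Reply SendQuery(string collection, string query); // OP_QUERY
    Reply SendGetMore(long cursorId);                 // OP_GETMORE
}

interface IConnectionPool
{
    IConnection CheckOut();
    void CheckIn(IConnection connection);
}

sealed class Reply
{
    public long CursorId;                            // 0 means the cursor is exhausted
    public List<string> Documents = new List<string>();
}

static class PinnedCursorExample
{
    // OP_GETMORE must reach the same server that answered the OP_QUERY, so
    // behind round-robin DNS or a load balancer the connection has to stay
    // checked out for the cursor's whole lifetime, however long that takes.
    public static IEnumerable<string> ReadAll(IConnectionPool pool, string collection, string query)
    {
        var connection = pool.CheckOut(); // pinned: unavailable to other operations
        try
        {
            var reply = connection.SendQuery(collection, query);
            while (true)
            {
                foreach (var doc in reply.Documents) yield return doc;
                if (reply.CursorId == 0) break;
                reply = connection.SendGetMore(reply.CursorId); // same socket, same server
            }
        }
        finally
        {
            pool.CheckIn(connection); // only now can the pool reuse the connection
        }
    }
}
```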
| Comment by Aristarkh Zagorodnikov [ 04/Sep/14 ] |
|
Excuse me, but I think you're missing an important case: mongos-only deployment. I believe almost any large-scale MongoDB deployment ends up sharding everything and doing non-admin work entirely through mongos. I still understand that it's far from the general case. |
| Comment by Craig Wilson [ 04/Sep/14 ] |
|
So, this would only be applicable to failover if all the servers were standalone servers and not part of a sharded cluster or a replica set. This is of limited value because we don't recommend people run standalones in production. In addition, replica sets and sharded clusters already have failover support in the driver (see the example below). |
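For context, the driver-native mongos failover Craig mentions works by listing every mongos address explicitly in the connection string. A small example using the modern 2.x API, with placeholder host names:

```csharp
// Driver-native mongos failover: seed the client with each mongos address
// explicitly; the driver monitors them all and fails over between them.
using MongoDB.Driver;

class NativeFailover
{
    static void Main()
    {
        var client = new MongoClient(
            "mongodb://mongos1.example.com:27017,mongos2.example.com:27017");
        var database = client.GetDatabase("test");
    }
}
```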
| Comment by Aristarkh Zagorodnikov [ 04/Sep/14 ] |
|
I was actually more interested in the failover scenario, but I understand the difficulty of implementing this relative to how useful it would be to the general public. |
| Comment by Aristarkh Zagorodnikov [ 04/Sep/14 ] |
|
Many thanks for the detailed reply, Craig! |
| Comment by Craig Wilson [ 03/Sep/14 ] |
|
Hi Aristarkh, We've thought about this type of thing in the same context as using a load balancer in front of a pool of mongos'es. Unfortunately, there simply isn't a good way to handle this, for a couple of reasons. First, our query protocol is stateful, meaning we need to send OP_GETMORE requests down the exact same connection as the original query. This is certainly doable, but presents its own problems related to connection pooling. Second, we need to monitor replica set members on an individual basis. If we can't guarantee that each connection to a server actually reaches that server, then working with replica sets/sharded clusters running heterogeneous versions presents a monitoring problem. It also presents the same problem with regard to deciding who is primary and should accept writes. We'll be closing this ticket for the above reasons. Feel free to leave a comment if you feel we've missed something. Thanks, |
| Comment by Aristarkh Zagorodnikov [ 03/Jul/14 ] |
|
Not if round-robin DNS were done inside the driver itself: it would do the DNS lookup, cache all available addresses, and then round-robin between them (see the sketch below). |
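A minimal sketch of that idea, with a class name and policy of my own choosing (no TTL handling, no failure detection):

```csharp
using System;
using System.Net;
using System.Threading;

// Sketch of what Aristarkh describes: resolve once, cache every address
// behind the name, and hand them out round-robin. Real code would honor
// the DNS record's TTL and drop unreachable addresses.
sealed class RoundRobinDnsResolver
{
    private readonly IPAddress[] _addresses;
    private int _next = -1;

    public RoundRobinDnsResolver(string hostName)
    {
        // One lookup up front; the result is cached for the resolver's lifetime.
        _addresses = Dns.GetHostAddresses(hostName);
        if (_addresses.Length == 0)
            throw new InvalidOperationException("No addresses for " + hostName);
    }

    public IPAddress Next()
    {
        // Thread-safe rotation through the cached addresses.
        var index = Interlocked.Increment(ref _next);
        return _addresses[(index & int.MaxValue) % _addresses.Length];
    }
}
```

Even with this in place, the OP_GETMORE affinity problem from the comments above remains: rotation only chooses where a new connection goes, not where a cursor's follow-up requests land.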
| Comment by Robert Stam [ 03/Jul/14 ] |
|
I don't think round-robin DNS would work currently, for the same reason that putting mongos'es behind a load balancer doesn't work (i.e. the OP_GETMORE messages usually end up at the wrong server). |