[SERVER-9711] make it impossible to have a wrong config server specification within a cluster Created: 16/May/13 Updated: 07/Mar/14 Resolved: 07/Mar/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Dwight Merriman | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
I imagine that via human error, typos, etc., especially when replacing a server, it is currently possible to have a cluster where there is no agreement on which config servers are the "right" ones. If this is already impossible, let me know and we can close this ticket.
For example, suppose machines a, b, c are the config servers and we replace c with d. To do that, one might put a copy of the a/b/c data (any) on d, and then switch everything over to use --configdb a,b,d. However, I imagine there could be a window of time where some mongod or mongos processes think a,b,c is authoritative and some think a,b,d is. We should ensure that in that situation error messages are logged and no mutations land on a/b/c/d with an inconsistent triplet of config servers.
I suppose if the config servers are a replica set, it is pretty hard to get the members out of sync. That is one approach, although for the config servers to be a replica set, some new functionality would be needed there to get the right transactional semantics. Here is another idea:
Perhaps the config servers are the only ones who need to share this CFG signature set, if all config server mutations are done by the config servers themselves. Then the other members of the cluster just ask one of the config servers to do that operation. The other members need less intelligence on this then. They could in theory read from a phantom config server by mistake, but they couldn't do a write that isn't consistent among the three. Partial detection would be a good start if something is easy and could go into 2.5. |
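As a rough illustration of the "partial detection" mentioned above, the following Python sketch groups cluster members by the configdb string each one was started with and reports any disagreement. It is not MongoDB code: the function name, the member names, and the idea of collecting every member's view in one place are all assumptions made for the example.

```python
from collections import defaultdict


def find_configdb_disagreements(views):
    """Group members by configdb string (hypothetical helper, not a MongoDB API).

    `views` maps a member name to the --configdb value it was started with.
    Returns {} when every member agrees; otherwise returns a dict mapping
    each distinct configdb string to the sorted list of members using it.
    """
    groups = defaultdict(list)
    for member, configdb in views.items():
        groups[configdb].append(member)
    if len(groups) <= 1:
        return {}
    return {cfg: sorted(members) for cfg, members in groups.items()}


# The scenario from the description: c is being replaced by d, and one
# mongos has already been switched over while the others have not.
views = {"mongos-1": "a,b,c", "mongos-2": "a,b,d", "shard-1": "a,b,c"}
conflicts = find_configdb_disagreements(views)
for cfg, members in conflicts.items():
    print(f"configdb {cfg!r} is in use by {members}")
```

A check like this only detects the inconsistency; it does not by itself stop a write from landing on the wrong triplet.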
| Comments |
| Comment by Greg Studer [ 07/Mar/14 ] |
|
As Scott mentioned, this is working as designed (WAD): shards will only accept requests from mongoses whose config string is exactly consistent. There's no way to prevent shards from being contacted by particular inconsistent mongoses, but we may also allow shards to be started with explicit config information rather than relying on the first mongos to populate it; that is tracked separately (related to auth info as well). |
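The behavior described in this comment can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not the server's implementation: the `Shard` class, `handle_request`, and `ConfigMismatchError` are hypothetical names, and the comparison is a plain exact string match.

```python
class ConfigMismatchError(Exception):
    """Raised when a mongos presents a configdb string the shard does not expect."""


class Shard:
    """Toy model of a shard that remembers the first configdb string it sees."""

    def __init__(self, name):
        self.name = name
        self.configdb = None  # not known until the first mongos contacts the shard

    def handle_request(self, mongos_configdb):
        if self.configdb is None:
            # The first mongos to connect populates the shard's config string.
            self.configdb = mongos_configdb
            return "accepted"
        if mongos_configdb != self.configdb:
            # Exact comparison: any difference in the string is a mismatch.
            raise ConfigMismatchError(
                f"{self.name}: expected configdb {self.configdb!r}, "
                f"got {mongos_configdb!r}"
            )
        return "accepted"


shard = Shard("shard-1")
shard.handle_request("a,b,c")      # first contact sets the expected string
try:
    shard.handle_request("a,b,d")  # a mongos started with the replaced server
except ConfigMismatchError as err:
    print("rejected:", err)
```

In this model the shard can still be contacted by an inconsistent mongos; it simply refuses the request, which matches the limitation noted above.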
| Comment by Scott Hernandez (Inactive) [ 17/May/13 ] |
|
Yes, as I stated, it is the shards which reject mongos requests with the incorrect configdb string. |
| Comment by Dwight Merriman [ 17/May/13 ] |
|
so you seem to be right at least partially. i tried this:
however this did start:
| Comment by Scott Hernandez (Inactive) [ 16/May/13 ] |
|
Dwight, the order/string of the config servers (configdb param to mongos) cannot be changed once set in a running sharded cluster. If you do what you are suggesting, you will get an error when you try to start a mongos and it tries to connect to an existing shard. In essence, it is impossible to run with two different configdb strings. Is there an example of user error or a use case you have seen where something like this happened without an error? |