Core Server / SERVER-9711

make it impossible to have a wrong config server specification within a cluster

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels:
      None

      Through human error (typos and the like), especially when replacing a server, I imagine it is currently possible to end up with a cluster whose members do not agree on which config servers are the "right" ones. If this is already impossible, let me know and we can close this ticket.

      For example, suppose machines a, b, and c are the config servers and we replace c with d. To do that, one might put a copy of the config data (from any of a, b, or c) on d and then switch everything over to use --configdb a,b,d. However, I imagine there could be a window of time in which some mongod or mongos processes think a,b,c is authoritative and some think a,b,d is. We should ensure that in that situation error messages are logged and that no mutation lands on any of a, b, c, or d through an inconsistent triplet of config servers.
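
      The window described above can be illustrated with a small Python sketch that groups cluster members by the config server set each one was started with. The function names and the sample member names (mongos1, mongos2, shard1) are hypothetical, and treating the host list as order-insensitive is an assumption made for this sketch; a stricter check could compare the strings exactly.

```python
from collections import defaultdict


def group_by_config_view(views):
    """Group members by the set of config servers each one believes in.

    `views` maps a member name to its --configdb string, e.g. "a,b,c".
    Assumption for this sketch: host order does not matter.
    """
    groups = defaultdict(list)
    for member, configdb in views.items():
        key = frozenset(h.strip() for h in configdb.split(",") if h.strip())
        groups[key].append(member)
    return dict(groups)


def find_disagreement(views):
    """Return [] when all members agree, else one (hosts, members) pair per view."""
    groups = group_by_config_view(views)
    if len(groups) <= 1:
        return []
    return sorted((sorted(hosts), sorted(members)) for hosts, members in groups.items())


# Hypothetical cluster caught mid-switch from a,b,c to a,b,d.
views = {
    "mongos1": "a,b,c",
    "mongos2": "a,b,d",
    "shard1": "a,b,d",
}
for hosts, members in find_disagreement(views):
    print("config servers", hosts, "assumed by", members)
```

      A check like this only detects the split after the fact; it does not by itself stop a write from landing during the window.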

      I suppose that if the config servers were a replica set, it would be pretty hard to get the members out of sync, so that is one approach. However, for the config servers to be a replica set, some new functionality would be needed there to provide the right transactional semantics.

      Here is another idea:

      • Each config server has an identity string for itself that is unique and persistent. Since hostnames could be duplicated, maybe we put a mongo.sig file with a GUID in it in /etc or somewhere similar. We wouldn't want it in the data directory, as that will be backed up and restored elsewhere, and this is about the machine's identity.
      • Then each machine in the cluster has a concept of who the three config servers are and can ask them for their signatures. So we have the set CFG = {S1, S2, S3} of the current config servers for the cluster. Operations on the config servers carry this "here is who I think the config servers are" set with them. A config server rejects the operation if the set isn't right: writes for sure, perhaps even reads.
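
      The signature idea in the list above could look roughly like the following Python sketch. Everything here is illustrative rather than an existing MongoDB interface: the ConfigServer class, the ConfigSetMismatch exception, and the method names are hypothetical, and the signature is passed in directly instead of being read from a file such as the proposed mongo.sig.

```python
import uuid


class ConfigSetMismatch(Exception):
    """Raised when a caller's idea of CFG differs from the server's."""


class ConfigServer:
    def __init__(self, signature=None):
        # Hypothetical: the proposal would load this from a file outside
        # the data directory; here it is supplied or generated in memory.
        self.signature = signature or str(uuid.uuid4())
        self.cfg = frozenset()
        self.data = {}

    def set_cfg(self, signatures):
        """Record the signature set this server considers authoritative."""
        signatures = frozenset(signatures)
        if self.signature not in signatures:
            raise ValueError("a config server must be a member of its own CFG")
        self.cfg = signatures

    def write(self, claimed_cfg, key, value):
        """Apply the write only if the caller names the same CFG set."""
        if frozenset(claimed_cfg) != self.cfg:
            raise ConfigSetMismatch(
                f"caller sent {sorted(claimed_cfg)}, server has {sorted(self.cfg)}"
            )
        self.data[key] = value


# Server S1 has already been switched to the new set {S1, S2, S4}.
s1 = ConfigServer("S1")
s1.set_cfg({"S1", "S2", "S4"})

# A stale client still believes the set is {S1, S2, S3}.
try:
    s1.write({"S1", "S2", "S3"}, "chunk-0", "shard-a")
except ConfigSetMismatch as exc:
    print("rejected:", exc)

# A client with the current set succeeds.
s1.write({"S1", "S2", "S4"}, "chunk-0", "shard-a")
```

      Comparing whole sets rather than individual hostnames is what catches the a,b,c versus a,b,d case: two of the three members match, but the set as a whole does not.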

      Perhaps the config servers are the only ones who need to share this CFG signature set, if all config server mutations are done by the config servers themselves. The other members of the cluster would then just ask one of the config servers to perform the operation, and so would need less intelligence about this. They could in theory read from a phantom config server by mistake, but they couldn't do a write that isn't consistent among the three.
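
      A minimal sketch of that variant, in Python, with a hypothetical Peer class and coordinated_write function: a cluster member hands the mutation to one config server, which applies it only if every server in its own CFG set reports the same set. This is not a real commit protocol (there is no handling of failures between the check and the writes); it only shows the agreement check.

```python
class Peer:
    """Hypothetical in-memory stand-in for one config server."""

    def __init__(self, signature, cfg):
        self.signature = signature
        self.cfg = frozenset(cfg)
        self.data = {}


def coordinated_write(coordinator, peers_by_signature, key, value):
    """Write to every server in the coordinator's CFG, or to none of them."""
    targets = []
    for sig in sorted(coordinator.cfg):
        peer = peers_by_signature.get(sig)
        # Refuse if a member is unknown or disagrees about the set.
        if peer is None or peer.cfg != coordinator.cfg:
            return False
        targets.append(peer)
    for peer in targets:
        peer.data[key] = value
    return True


cfg = {"S1", "S2", "S3"}
peers = {sig: Peer(sig, cfg) for sig in cfg}

# All three agree, so the write is applied everywhere.
first = coordinated_write(peers["S1"], peers, "chunk-0", "shard-a")

# S3 is reconfigured to a different set; the next write is refused.
peers["S3"].cfg = frozenset({"S1", "S2", "S4"})
second = coordinated_write(peers["S1"], peers, "chunk-0", "shard-b")
print(first, second)
```

      With this arrangement a mongos or mongod never needs to know the signatures at all; it only needs to reach one config server that is willing to coordinate.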

      Partial detection would be a good start if something is easy and could go into 2.5.

            Assignee: Unassigned
            Reporter: Dwight Merriman (dwight@mongodb.com)
            Votes: 1
            Watchers: 7

              Created:
              Updated:
              Resolved: