Core Server / SERVER-16733

mongos does not fail when different configdb string is used

    • Type: Bug
    • Resolution: Done
    • Priority: Blocker - P1
    • Fix Version/s: 2.8.0-rc5
    • Affects Version/s: 2.7.8
    • Component/s: Sharding
    • None
    • Fully Compatible
    • ALL
    • Steps to reproduce:
      1. Start a minimal sharded environment, e.g.:
        mlaunch init --single --sharded 1 --config 3 --mongos 1
        

        In my case, this has a configdb string of "genique:27019,genique:27020,genique:27021".

      2. Start another mongos with a different configdb string, e.g. reorder the config servers, or use FQDNs:
        mongos --port 27022 --configdb genique.local:27019,genique.local:27020,genique.local:27021
        mongos --port 27023 --configdb genique:27021,genique:27020,genique:27019
        
      3. Connect to the new mongos and write to the config db.
        $ mongo --port 27022 config
        MongoDB shell version: 2.6.6
        connecting to: 127.0.0.1:27022/config
        Server has startup warnings:
        2015-01-06T08:37:44.758+1100 I -
        2015-01-06T08:37:44.758+1100 I -        ** NOTE: This is a development version (2.7.8) of MongoDB.
        2015-01-06T08:37:44.758+1100 I -        **       Not recommended for production.
        2015-01-06T08:37:44.758+1100 I -
        > db.foo.insert({})
        WriteResult({ "nInserted" : 1 })
        

        This operation should fail. For example, the result with a 2.7.7 mongos is:

        $ mongo --port 27022 config
        MongoDB shell version: 2.6.6
        connecting to: 127.0.0.1:27022/config
        Server has startup warnings:
        2015-01-06T13:07:25.141+1100 I -
        2015-01-06T13:07:25.142+1100 I -        ** NOTE: This is a development version (2.7.7) of MongoDB.
        2015-01-06T13:07:25.142+1100 I -        **       Not recommended for production.
        2015-01-06T13:07:25.142+1100 I -
        > db.foo.insert({})
        WriteResult({
                "writeError" : {
                        "code" : 25,
                        "errmsg" : "could not verify config servers were active and reachable before write"
                }
        })
        

        (The above uses write commands, but the outcome is the same if legacy mode is used instead.)


      If mongos 2.7.8 or later is started with a configdb string that differs from other mongoses, it can successfully perform writes to the config db. This is possible even if the config servers specified by the mongos are partially or wholly disjoint from the actual/existing config servers. The expected behaviour (and actual behaviour in 2.7.7 and earlier) is that these writes fail (with "could not verify config servers were active and reachable before write").
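
To make "differs" concrete, here is a minimal sketch using the three configdb strings from the steps above. The helper names are hypothetical and not part of mongos. The sketch assumes the check treats the configdb string literally, which is how this report treats a reordered string.

```javascript
// Hypothetical helpers, not part of mongos. They only illustrate why the
// strings used in this report count as "different".

// Literal comparison: assumed here to be what matters for the check.
function sameConfigdbString(a, b) {
    return a === b;
}

// Order-insensitive comparison of the host:port entries, for contrast.
function sameHostSet(a, b) {
    var left = a.split(",").sort();
    var right = b.split(",").sort();
    return left.length === right.length &&
        left.every(function (host, i) { return host === right[i]; });
}

var original = "genique:27019,genique:27020,genique:27021";
var reordered = "genique:27021,genique:27020,genique:27019";
var fqdn = "genique.local:27019,genique.local:27020,genique.local:27021";

// The reordered string names the same hosts but is a different string.
var reorderedIsDifferentString = !sameConfigdbString(original, reordered);
var reorderedHasSameHosts = sameHostSet(original, reordered);

// The FQDN string differs both as a string and entry by entry.
var fqdnIsDifferentString = !sameConfigdbString(original, fqdn);
var fqdnHasSameHosts = sameHostSet(original, fqdn);
```

Both variants are "different" under a literal comparison, even though the reordered one refers to the same three config servers.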

      git bisect confirms that this is a regression caused by SERVER-15375, i.e. f67afb4ff33bd803e93e2a52c0249cb872af680b is the breaking commit.

      The proximal cause is that the mongos fails to call setShardVersion (SSV) with init: true when it connects to the config servers. The absence of this SSV has been verified with logLevel 1 on the config servers. As a result, the usual configdb string check is not performed, and the mongos is unaware that its configdb string differs from that of other mongoses.
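
The effect of the missing SSV can be sketched with a toy model. This is not the server code: ConfigServerModel, MongosModel and the mismatch message are hypothetical names invented for illustration. Only the two write results are taken from the transcripts in this report.

```javascript
// Toy model only. All names are hypothetical and do not mirror the server source.

// The config server remembers the configdb string it expects.
function ConfigServerModel(knownConfigdb) {
    this.knownConfigdb = knownConfigdb;
}

// Stands in for the configdb string check done on setShardVersion with init: true.
ConfigServerModel.prototype.setShardVersionInit = function (mongosConfigdb) {
    if (mongosConfigdb !== this.knownConfigdb) {
        return { ok: 0, errmsg: "configdb string mismatch (placeholder message)" };
    }
    return { ok: 1 };
};

// sendsInitSSV models the behaviour change: true for 2.7.7, false for 2.7.8.
function MongosModel(configdb, sendsInitSSV) {
    this.configdb = configdb;
    this.sendsInitSSV = sendsInitSSV;
}

MongosModel.prototype.insertIntoConfig = function (server) {
    if (this.sendsInitSSV) {
        var res = server.setShardVersionInit(this.configdb);
        if (!res.ok) {
            // Error shape copied from the 2.7.7 transcript in this report.
            return {
                writeError: {
                    code: 25,
                    errmsg: "could not verify config servers were active and reachable before write"
                }
            };
        }
    }
    // Without the SSV nothing ever compares the strings, so the write goes through.
    return { nInserted: 1 };
};

var server = new ConfigServerModel("genique:27019,genique:27020,genique:27021");
var mismatched = "genique:27021,genique:27020,genique:27019";

var resultWithSSV = new MongosModel(mismatched, true).insertIntoConfig(server);
var resultWithoutSSV = new MongosModel(mismatched, false).insertIntoConfig(server);
```

In this model resultWithSSV carries the code 25 write error, while resultWithoutSSV reports a successful insert, matching the 2.7.7 and 2.7.8 transcripts respectively.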

      There should be a jstest to check that a mongos started with a different configdb string is unable to write to the config db.
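
The core assertion of such a test might look like the sketch below. It assumes the write result is available as a plain object shaped like the shell output shown in the steps. isRejectedConfigWrite is a hypothetical helper, not an existing jstest utility, and starting the cluster and the second mongos is left out.

```javascript
// Hypothetical helper, not an existing jstest utility. It works on plain
// objects shaped like the WriteResult output printed in this report.
function isRejectedConfigWrite(res) {
    return !!res.writeError &&
        res.writeError.code === 25 &&
        !res.nInserted;
}

// Result observed with a 2.7.7 mongos (expected behaviour).
var expectedResult = {
    writeError: {
        code: 25,
        errmsg: "could not verify config servers were active and reachable before write"
    }
};

// Result observed with a 2.7.8 mongos (the regression).
var regressedResult = { nInserted: 1 };

var expectedPasses = isRejectedConfigWrite(expectedResult);
var regressionCaught = !isRejectedConfigWrite(regressedResult);
```

A real jstest would obtain the result from db.foo.insert({}) on the mismatched mongos and fail when the helper returns false.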

            Assignee:
            randolph@mongodb.com Randolph Tan
            Reporter:
            kevin.pulo@mongodb.com Kevin Pulo
            Votes:
            0
            Watchers:
            12

              Created:
              Updated:
              Resolved: