Core Server / SERVER-26761

Old ReplicaSetMonitor can be used on the config server when adding a new shard with the same setName as a recently removed shard


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.4.0-rc4
    • Component/s: Sharding
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Sprint:
      Sharding 2016-11-21
    • Linked BF Score:
      0

      Description

      The removed shard's ReplicaSetMonitor is removed from the ReplicaSetMonitorManager synchronously (i.e., at the end of removeShard()) only on the mongos that runs the removeShard().

      All other processes remove the ReplicaSetMonitor only the next time they run ShardRegistry::reload() (which, in the worst case, is 30 seconds later) and notice that the shard no longer exists in config.shards.

      If, after the removeShard(), a new shard is added with the same replica set name before the config server has done a ShardRegistry::reload(), the config server will use the old shard's ReplicaSetMonitor to target the new shard (including for the addShard checks).

      This is because ReplicaSetMonitorManager::getOrCreateMonitor() indexes ReplicaSetMonitor instances by setName instead of some unique id:

      https://github.com/mongodb/mongo/blob/r3.4.0-rc1/src/mongo/client/replica_set_monitor_manager.cpp#L95-L99
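
The keying problem can be illustrated with a small Python model. This is only a sketch, not the server's C++ code: the class and method names below are hypothetical stand-ins, and it assumes (as the description above implies) that a lookup by set name returns any existing monitor and ignores the newly supplied seed list.

```python
class ReplicaSetMonitor:
    """Toy stand-in for a monitor: remembers the hosts it was seeded with."""

    def __init__(self, set_name, seed_hosts):
        self.set_name = set_name
        self.hosts = list(seed_hosts)


class ReplicaSetMonitorManager:
    """Toy manager that caches monitors by set name (assumed behavior)."""

    def __init__(self):
        self._monitors = {}

    def get_or_create_monitor(self, set_name, seed_hosts):
        # An existing entry for this set name wins; the new seed list is ignored.
        monitor = self._monitors.get(set_name)
        if monitor is None:
            monitor = ReplicaSetMonitor(set_name, seed_hosts)
            self._monitors[set_name] = monitor
        return monitor

    def remove_monitor(self, set_name):
        self._monitors.pop(set_name, None)


manager = ReplicaSetMonitorManager()

# Monitor for the original shard "mySet".
old = manager.get_or_create_monitor("mySet", ["hostname:15515"])

# The shard is removed elsewhere, but this process has not reloaded yet,
# so remove_monitor() has not been called here.
new = manager.get_or_create_monitor("mySet", ["hostname:15516"])

# The "new" monitor is the stale one and still points at the old host.
stale_reused = new is old
```

In this model, keying the cache by something unique to the shard (rather than by set name alone) would make the second lookup create a fresh monitor.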

      1) If the old shard is still up, the addShard() will (incorrectly) fail with error:

      "in seed list mySet/hostname:15516, host hostname:15516 does not belong to replica set mySet; found { hosts: [ \"hostname:15515\" ], setName: \"mySet\", setVersion: 1, ismaster: true, secondary: false, primary: \"hostname:15515\",  ..."
      

      2) If the old shard was shut down, by a lucky additional pair of bugs (see SERVER-26759 and SERVER-26760), the old ReplicaSetMonitor will be removed after the first HostUnreachable response for the old shard, a new ReplicaSetMonitor will be created on the retry, and the addShard will (correctly) succeed.
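
The two outcomes above can be sketched with a simplified Python model. Everything here is hypothetical and only mirrors the behavior described in this ticket: `add_shard`, the `monitors` dict (set name to the host the cached monitor targets) and the `live_sets` dict (reachable host to the members it reports) are illustrative names, and the single retry is an assumption.

```python
class HostUnreachable(Exception):
    """Raised when the targeted host is not running (toy model)."""


def add_shard(monitors, live_sets, set_name, seed_host, retries=1):
    """Model the addShard membership check against a possibly stale monitor."""
    for _ in range(retries + 1):
        # Reuse the cached monitor for this set name, or create one from the seed.
        target = monitors.setdefault(set_name, seed_host)
        try:
            if target not in live_sets:
                raise HostUnreachable(target)
            members = live_sets[target]
            if seed_host not in members:
                # Case 1: the old shard answers and does not list the new host.
                return f"error: host {seed_host} does not belong to replica set {set_name}"
            return "ok"
        except HostUnreachable:
            # Case 2: the stale monitor is dropped, so the retry creates a new one.
            del monitors[set_name]
    return "error: host unreachable"


# Case 1: old shard (port 15515) is still up.
case1 = add_shard(
    {"mySet": "hostname:15515"},
    {"hostname:15515": ["hostname:15515"], "hostname:15516": ["hostname:15516"]},
    "mySet",
    "hostname:15516",
)

# Case 2: old shard was shut down; only the new shard is reachable.
case2 = add_shard(
    {"mySet": "hostname:15515"},
    {"hostname:15516": ["hostname:15516"]},
    "mySet",
    "hostname:15516",
)
```

In this sketch, `case1` is an error string and `case2` is `"ok"`, matching the incorrect failure and the accidental success described above.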
