[SERVER-26761] old ReplicaSetMonitor can be used on config when adding new shard with same setName as recently removed shard Created: 25/Oct/16 Updated: 19/Nov/16 Resolved: 14/Nov/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 3.4.0-rc4 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Esha Maharishi (Inactive) | Assignee: | Esha Maharishi (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Sprint: | Sharding 2016-11-21 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 0 | ||||||||
| Description |
|
The ReplicaSetMonitor is removed from the ReplicaSetMonitorManager synchronously (i.e., at the end of removeShard()) only on the mongos doing the removeShard(). All other processes remove the ReplicaSetMonitor the next time they do a ShardRegistry::reload() (which in the worst case happens every 30 seconds) and notice the shard no longer exists in config.shards.

If, after the removeShard(), a new shard is added with the same replica set name and the config server has not done a ShardRegistry::reload() yet, the config server will use the old shard's ReplicaSetMonitor to target the new shard (including for the addShard checks). This is because ReplicaSetMonitorManager::getOrCreateMonitor() indexes ReplicaSetMonitor instances by setName instead of some unique id:

1) If the old shard is still up, the addShard() will (incorrectly) fail with error:
2) If the old shard was shut down, by a lucky additional pair of bugs (see |
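The stale lookup described in this ticket can be illustrated with a toy model. This is a minimal Python sketch, not MongoDB code: `ToyMonitor`, `ToyMonitorManager`, and the host strings are hypothetical stand-ins, and the only property modelled is that monitors are cached by set name alone.

```python
class ToyMonitor:
    """Hypothetical stand-in for a ReplicaSetMonitor: remembers its seed hosts."""

    def __init__(self, set_name, hosts):
        self.set_name = set_name
        self.hosts = list(hosts)


class ToyMonitorManager:
    """Hypothetical stand-in for ReplicaSetMonitorManager: keyed by set name only."""

    def __init__(self):
        self._monitors = {}

    def get_or_create_monitor(self, set_name, hosts):
        # An existing entry wins, even when the caller passes different hosts.
        if set_name not in self._monitors:
            self._monitors[set_name] = ToyMonitor(set_name, hosts)
        return self._monitors[set_name]


# Cache as seen by a process that has not yet reloaded its shard registry.
config_server = ToyMonitorManager()
old = config_server.get_or_create_monitor("rs0", ["old-host:27017"])

# The old shard was removed elsewhere; a new shard reuses the set name "rs0".
new = config_server.get_or_create_monitor("rs0", ["new-host:27017"])
stale_hosts = new.hosts  # still the hosts of the removed shard
```

In this model the second lookup returns the very same object as the first, so any check made through it is aimed at the removed shard's hosts.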
| Comments |
| Comment by Githook User [ 14/Nov/16 ] |
|
Author: Esha Maharishi (EshaMaharishi, esha.maharishi@mongodb.com)
Message: |
| Comment by Esha Maharishi (Inactive) [ 25/Oct/16 ] |
|
Potential fix: call ReplicaSetMonitor::remove() for the removed shard in the OpObserver for removes to config.shards. |
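A rough sketch of that idea, in Python rather than the server's C++. Everything here is hypothetical: `ToyMonitorCache` and `on_delete` are illustrative names, not the real OpObserver interface, and the code assumes the deleted shard document carries a `host` field of the form `<setName>/<host>,<host>`.

```python
class ToyMonitorCache:
    """Hypothetical per-process cache of monitors, keyed by replica set name."""

    def __init__(self):
        self._by_set_name = {}

    def get_or_create(self, set_name, hosts):
        # Returns the cached host list if one exists, else stores the new one.
        return self._by_set_name.setdefault(set_name, list(hosts))

    def remove(self, set_name):
        self._by_set_name.pop(set_name, None)


def on_delete(cache, namespace, deleted_doc):
    """Hypothetical observer hook, called after a document is deleted."""
    if namespace != "config.shards":
        return
    # Assumption: replica set shards store host as "<setName>/<host1>,<host2>".
    host = deleted_doc.get("host", "")
    if "/" in host:
        cache.remove(host.split("/", 1)[0])


cache = ToyMonitorCache()
cache.get_or_create("rs0", ["old-host:27017"])

# Deleting the shard document drops the cached monitor immediately,
# without waiting for a periodic reload.
on_delete(cache, "config.shards", {"_id": "shard0000", "host": "rs0/old-host:27017"})

fresh = cache.get_or_create("rs0", ["new-host:27017"])
```

With the eager removal in place, the lookup for the new shard builds a fresh entry instead of reusing the one left over from the removed shard.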