Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: 3.4.0-rc4
Affects Version/s: None
Component/s: Sharding
Labels: None

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2016-11-21
Linked BF Score: 0
Confidence Status: None
Work Order: 3
CAR Domain/s: None

Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Only the mongos that runs removeShard() removes the shard's ReplicaSetMonitor from its ReplicaSetMonitorManager synchronously (i.e., at the end of removeShard()).

All other processes remove the ReplicaSetMonitor the next time they do a ShardRegistry::reload() (which in the worst case happens every 30 seconds) and notice the shard no longer exists in config.shards.

If, after the removeShard(), a new shard is added with the same replica set name before the config server has done a ShardRegistry::reload(), the config server will use the old shard's ReplicaSetMonitor to target the new shard (including for the addShard checks).

This is because ReplicaSetMonitorManager::getOrCreateMonitor() indexes ReplicaSetMonitor instances by setName instead of some unique id:

https://github.com/mongodb/mongo/blob/r3.4.0-rc1/src/mongo/client/replica_set_monitor_manager.cpp#L95-L99
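The lookup behavior described above can be illustrated with a minimal sketch. This is not the server's code: `Monitor`, `MonitorManager`, `removeMonitor`, and `hostSeenForNewShard` are hypothetical stand-ins, and the only property modeled is that monitors are keyed by set name, so an existing entry is returned even when the caller passes a different seed list.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for ReplicaSetMonitor: remembers the hosts it was seeded with.
struct Monitor {
    std::string setName;
    std::vector<std::string> hosts;
};

// Hypothetical stand-in for ReplicaSetMonitorManager, keyed by set name only.
class MonitorManager {
public:
    std::shared_ptr<Monitor> getOrCreateMonitor(const std::string& setName,
                                                const std::vector<std::string>& seeds) {
        auto it = _monitors.find(setName);
        if (it != _monitors.end()) {
            return it->second;  // existing entry wins; the new seeds are ignored
        }
        auto monitor = std::make_shared<Monitor>(Monitor{setName, seeds});
        _monitors.emplace(setName, monitor);
        return monitor;
    }

    // Models the cleanup that a reload performs once the shard is gone.
    void removeMonitor(const std::string& setName) { _monitors.erase(setName); }

private:
    std::map<std::string, std::shared_ptr<Monitor>> _monitors;
};

// Returns the host the manager hands back when the *new* shard is looked up.
inline std::string hostSeenForNewShard(bool reloadedBetween) {
    MonitorManager manager;
    manager.getOrCreateMonitor("mySet", {"hostname:15515"});  // old shard
    if (reloadedBetween) {
        manager.removeMonitor("mySet");
    }
    auto monitor = manager.getOrCreateMonitor("mySet", {"hostname:15516"});  // new shard
    return monitor->hosts.front();
}
```

In this model, `hostSeenForNewShard(false)` yields the old shard's host, while `hostSeenForNewShard(true)` yields the new one. Keying by an identifier that is unique per shard rather than by set name would avoid the collision.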

1) If the old shard is still up, the addShard() will (incorrectly) fail with error:

"in seed list mySet/hostname:15516, host hostname:15516 does not belong to replica set mySet; found { hosts: [ \"hostname:15515\" ], setName: \"mySet\", setVersion: 1, ismaster: true, secondary: false, primary: \"hostname:15515\",  ..."

2) If the old shard was shut down, by a lucky additional pair of bugs (see ~~SERVER-26759~~ and ~~SERVER-26760~~), the old ReplicaSetMonitor will be removed after the first HostUnreachable response for the old shard, a new ReplicaSetMonitor will be created on the retry, and the addShard will (correctly) succeed.
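The error text in case 1 suggests a membership check of each seed host against the hosts the replica set reports. The sketch below is a simplified model of such a check, assuming the reported host list comes from whichever node the (possibly stale) monitor targets; `firstForeignSeed` is a hypothetical helper, not a function in the server.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the first seed host that the replica set did not
// report as a member, or an empty string if every seed host is a member.
inline std::string firstForeignSeed(const std::vector<std::string>& seeds,
                                    const std::vector<std::string>& reportedHosts) {
    for (const auto& seed : seeds) {
        if (std::find(reportedHosts.begin(), reportedHosts.end(), seed) ==
            reportedHosts.end()) {
            return seed;
        }
    }
    return "";
}
```

With the seed list `mySet/hostname:15516` and a response from the old shard listing only `hostname:15515`, the helper flags `hostname:15516`, which matches the shape of the failure quoted in case 1. Against a response from the new shard, nothing is flagged.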

Is depended on by: SERVER-26785 rewrite addshard2.js to be able to unblacklist it from the last_stable suite

Closed

Assignee: Esha Maharishi (Inactive)
Reporter: Esha Maharishi (Inactive)
Participants: Esha Maharishi, Githook User
Votes: 0
Watchers: 4

Created: Oct 25 2016 02:48:49 PM UTC
Updated: Nov 19 2016 12:06:22 AM UTC
Resolved: Nov 14 2016 09:35:06 PM UTC
Confidence Status Last Update: 11/Nov/16 9:29 PM
