[SERVER-37510] ShardRegistry can have a shard host mapped to a config shard Created: 08/Oct/18  Updated: 06/Dec/22  Resolved: 17/Jan/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Randolph Tan Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File repro.diff     File test.js    
Issue Links:
Related
Assigned Teams:
Sharding
Participants:

 Description   

This can happen if the host:port combination originally refers to a config server node and later to a shard node.

To make this easier to catch, we should use unordered_map::insert when populating _hostLookup and log a warning if the insert is unsuccessful (that is, when the host is already in the map) here:
https://github.com/mongodb/mongo/blob/r4.1.3/src/mongo/s/client/shard_registry.cpp#L593

A failed insert can mean that the host was part of another shard; otherwise it would have been removed here:
https://github.com/mongodb/mongo/blob/r4.1.3/src/mongo/s/client/shard_registry.cpp#L567
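
The proposal above could look roughly like the following. This is a minimal sketch, not the actual ShardRegistry code: HostLookup and addHosts are hypothetical names, hosts and shard ids are plain strings, and the warning goes to std::cerr rather than the server's logging facility.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for _hostLookup: host:port -> shard id.
using HostLookup = std::unordered_map<std::string, std::string>;

// Maps each host to shardId without overwriting existing entries.
// Returns the hosts that were already present in the map.
std::vector<std::string> addHosts(HostLookup& lookup,
                                  const std::string& shardId,
                                  const std::vector<std::string>& hosts) {
    assert(!shardId.empty());
    std::vector<std::string> conflicts;
    for (const auto& host : hosts) {
        // insert() leaves an existing entry untouched and reports
        // whether the insertion happened through result.second.
        auto result = lookup.insert(std::make_pair(host, shardId));
        if (!result.second) {
            std::cerr << "warning: host " << host
                      << " is already mapped to shard "
                      << result.first->second
                      << "; not remapping it to " << shardId << '\n';
            conflicts.push_back(host);
        }
    }
    return conflicts;
}
```

In this sketch the existing mapping wins and the conflict is only reported. Whether the registry should keep the old entry or take the new one is a separate decision that the ticket does not settle.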



 Comments   
Comment by Ratika Gandhi [ 17/Jan/20 ]

Unlikely to happen in practice. 

Comment by Gregory McKeon (Inactive) [ 06/May/19 ]

kaloian.manassiev can we make this 4.2.0 and address it between the rc and GA?

Comment by Randolph Tan [ 09/Oct/18 ]

Attached test.js and repro.diff (based on the v3.6 branch), which demonstrate that it is possible for a shard replica set node to be mislabeled 'config' in the ShardRegistry.

The sequence of events is as follows:
1. Original --configdb string is "config/A,B"
2. Config replica set is reconfigured to "config/A,C".
3. B is added as a new node to shard s0 -> "s0/D,B"
4. ShardRegistry is updated to "D,B" via callback from RSM.
5. ShardRegistry reloads, B still maps to s0 in _hostLookup.
6. Next step in reload is to set the known config string as the "config" shard, so B now becomes mapped to "config".

Once this happens, commands will start hitting the "Surprised to discover that ... does not believe it is a config server" error.
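
The sequence above can be replayed against a toy model of the lookup table. This is a sketch under stated assumptions, not the code in repro.diff: mapHosts and replayScenario are hypothetical names, the map is updated with overwriting assignment (which is what the proposal to switch to unordered_map::insert implies), and the known config string is assumed to still be "config/A,B" when the reload runs.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for _hostLookup: host:port -> shard id.
using HostLookup = std::unordered_map<std::string, std::string>;

// Overwriting update: the last writer wins, with no warning.
void mapHosts(HostLookup& lookup,
              const std::string& shardId,
              const std::vector<std::string>& hosts) {
    for (const auto& host : hosts) {
        lookup[host] = shardId;
    }
}

// Replays steps 1-6 and returns the resulting lookup table.
HostLookup replayScenario() {
    HostLookup lookup;

    // Step 1: the original --configdb string is "config/A,B".
    const std::vector<std::string> knownConfigHosts = {"A", "B"};
    mapHosts(lookup, "config", knownConfigHosts);

    // Step 2: the config replica set becomes "config/A,C", but the
    // known config string above is not refreshed (model assumption).

    // Steps 3-4: B joins shard s0 and the registry learns "D,B".
    mapHosts(lookup, "s0", {"D", "B"});

    // Step 5: after the reload, B still maps to s0.
    assert(lookup.at("B") == "s0");

    // Step 6: the reload sets the known config string as the
    // "config" shard, silently remapping B.
    mapHosts(lookup, "config", knownConfigHosts);
    return lookup;
}
```

After replayScenario() returns, B is mapped to "config" even though it is now a member of s0, which matches the mislabeling described in this comment.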
