[SERVER-11571] Better error message when attempting to add a shard that is a replica set that hasn't been initialized Created: 05/Nov/13  Updated: 06/Dec/22  Resolved: 07/Nov/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sujay Mansingh Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Done Votes: 0
Labels: ShardingRoughEdges, neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Sharding
Operating System: ALL
Participants:

 Description   

If you have a group of mongod nodes that have been started but haven't yet had rs.initiate() run on them, and you then try to add those nodes as a replica set shard, you just get back a socket exception with no information about the true nature of the problem.

Example:

mongos> sh.addShard("rs_shard6/shard6-ovh-01.edtd.net:27017,shard6-ovh-02.edtd.net:27017")
{
	"ok" : 0,
	"errmsg" : "couldn't connect to new shard socket exception [CONNECT_ERROR] for rs_shard6/shard6-ovh-01.edtd.net:27017,shard6-ovh-02.edtd.net:27017"
}

From the logs:

Tue Nov  5 10:58:59.111 [conn8] starting new replica set monitor for replica set rs_shard6 with seed of shard6-ovh-01.edtd.net:27017,shard6-ovh-02.edtd.net:27017
Tue Nov  5 10:58:59.169 [conn8] successfully connected to seed shard6-ovh-01.edtd.net:27017 for replica set rs_shard6
Tue Nov  5 10:58:59.175 [conn8] warning: node: shard6-ovh-01.edtd.net:27017 isn't a part of set: rs_shard6 ismaster: { ismaster: false, secondary: false, info: "can't get local.system.replset config from self or any seed (EMPTYCONFIG)", isreplicaset: true, maxBsonObjectSize: 16777216, maxMessageSizeBytes: 48000000, localTime: new Date(1383649139157), ok: 1.0 }
Tue Nov  5 10:58:59.247 [conn8] successfully connected to seed shard6-ovh-02.edtd.net:27017 for replica set rs_shard6
Tue Nov  5 10:58:59.258 [conn8] warning: node: shard6-ovh-02.edtd.net:27017 isn't a part of set: rs_shard6 ismaster: { ismaster: false, secondary: false, info: "can't get local.system.replset config from self or any seed (EMPTYCONFIG)", isreplicaset: true, maxBsonObjectSize: 16777216, maxMessageSizeBytes: 48000000, localTime: new Date(1383649139253), ok: 1.0 }
Tue Nov  5 10:59:01.258 [conn8] warning: No primary detected for set rs_shard6
Tue Nov  5 10:59:01.258 [conn8] All nodes for set rs_shard6 are down. This has happened for 1 checks in a row. Polling will stop after 29 more failed checks
Tue Nov  5 10:59:01.258 [conn8] replica set monitor for replica set rs_shard6 started, address is rs_shard6/
Tue Nov  5 10:59:01.258 [conn8] deleting replica set monitor for: rs_shard6/
Tue Nov  5 10:59:01.258 [conn8] addshard request { addShard: "rs_shard6/shard6-ovh-01.edtd.net:27017,shard6-ovh-02.edtd.net:27017" } failed: couldn't connect to new shard socket exception [CONNECT_ERROR] for rs_shard6/shard6-ovh-01.edtd.net:27017,shard6-ovh-02.edtd.net:27017



 Comments   
Comment by Sheeri Cabral (Inactive) [ 07/Nov/19 ]

In the current version, this fails with "failed to satisfy readPreference" because none of the reachable instances is a primary.

Comment by Sujay Mansingh [ 05/Nov/13 ]

Looks like I needed to call `rs.initiate()` on one of the nodes and then `rs.add(<path-to-other-node>)`.

Now my replica-set has been added.
I hope it rebalances.
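
For reference, a variant of the workaround above is to initiate the set with all members in one step rather than calling `rs.add()` afterwards. The sketch below assumes a hypothetical helper, `buildReplSetConfig`, that derives the config document from the same `<setName>/<host>,<host>` string passed to `sh.addShard()`; the `{ _id, members: [{ _id, host }] }` shape is the one `rs.initiate()` accepts.

```javascript
// Hypothetical helper: build a replica set config document from a shard
// connection string of the form "<setName>/<host1>,<host2>,...".
function buildReplSetConfig(shardString) {
  const slash = shardString.indexOf("/");
  if (slash <= 0) {
    throw new Error("expected <setName>/<host1>,<host2>,...");
  }
  const setName = shardString.slice(0, slash);
  const hosts = shardString
    .slice(slash + 1)
    .split(",")
    .filter((h) => h.length > 0);
  if (hosts.length === 0) {
    throw new Error("no hosts given for replica set " + setName);
  }
  // Member _id values only need to be unique integers within the set.
  return {
    _id: setName,
    members: hosts.map((host, i) => ({ _id: i, host: host })),
  };
}

// Usage in the mongo shell (not executed here):
//   connected to one of the shard's mongod nodes:
//     rs.initiate(buildReplSetConfig("rs_shard6/shard6-ovh-01.edtd.net:27017,shard6-ovh-02.edtd.net:27017"))
//   then, connected to mongos once a primary has been elected:
//     sh.addShard("rs_shard6/shard6-ovh-01.edtd.net:27017,shard6-ovh-02.edtd.net:27017")
```

Either route ends in the same state; the point is only that the set must have a config and an elected primary before `sh.addShard()` can succeed.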
