[SERVER-16539] Cannot add shard due to mongos expanding the shard host to include all members Created: 12/Dec/14  Updated: 21/Jun/19  Resolved: 21/Jun/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.8.0-rc2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Michael Grundy Assignee: Mira Carey
Resolution: Won't Fix Votes: 0
Labels: 28qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Tested
Operating System: ALL
Participants:

 Description   

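Adding a large (45-member) replica set as a shard fails because mongos expands the seed host into the full list of discovered members, and that entire string becomes the key in the host_1 index on config.shards. If you add a smaller replica set and expand it later, it don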

mongos> db.runCommand({addShard:"bigRepl1/ec2-54-175-35-158.compute-1.amazonaws.com:27017", name:"ShardBigRepl1"})
{
	"ok" : 0,
	"errmsg" : "Btree::insert: key too large to index, failing config.shards.$host_1 2163 { : \"bigRepl1/ec2-54-165-103-121.compute-1.amazonaws.com:27017,ec2-54-165-239-161.compute-1.amazonaws.com:27017,ec2-54-165-41-0.compute-1.amazonaws.com:270...\" }"
}
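
To see why the insert is rejected, a rough size check helps. The sketch below is illustrative only: it assumes the 1024-byte index key limit that servers of this era enforced, and it uses made-up hostnames shaped like the EC2 names in the error. `INDEX_KEY_LIMIT` and `shard_host_string` are hypothetical names, not mongos internals.

```python
# Assumption: servers of this era rejected index keys larger than 1024 bytes.
INDEX_KEY_LIMIT = 1024


def shard_host_string(set_name, hosts):
    """Build a 'setName/host1,host2,...' string like the one shown in the error."""
    return set_name + "/" + ",".join(hosts)


# Hypothetical hostnames, shaped like the EC2 names in the logs.
hosts = ["ec2-54-175-35-%d.compute-1.amazonaws.com:27017" % i for i in range(100, 145)]

small = shard_host_string("bigRepl1", hosts[:3])  # 3-member set
large = shard_host_string("bigRepl1", hosts)      # 45-member set

for label, value in (("3 members", small), ("45 members", large)):
    size = len(value.encode("utf-8"))
    verdict = "fits" if size <= INDEX_KEY_LIMIT else "too large to index"
    print(f"{label}: {size} bytes, {verdict}")
```

With names of this length, a handful of members stays well under the assumed limit, while 45 members lands at roughly twice the limit, in the same range as the 2163 reported in the error.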

Config Server:

2014-12-12T21:17:37.670+0000 D WRITE    [conn11]  Caught Assertion in insert, continuing  :: caused by :: Btree::insert: key too large to index, failing config.shards.$host_1 2163 { : "bigRepl1/ec2-54-165-103-121.compute-1.amazonaws.com:27017,ec2-54-165-239-161.compute-1.amazonaws.com:27017,ec2-54-165-41-0.compute-1.amazonaws.com:270..." }
2014-12-12T21:17:37.670+0000 I WRITE    [conn11] insert config.shards query: { _id: "ShardBigRepl1", host: "bigRepl1/ec2-54-165-103-121.compute-1.amazonaws.com:27017,ec2-54-165-239-161.compute-1.amazonaws.com:27017,ec2-54-165-41-0.compute-1.amazonaws.com:270..." } ninserted:0 keyUpdates:0 exception: Btree::insert: key too large to index, failing config.shards.$host_1 2163 { : "bigRepl1/ec2-54-165-103-121.compute-1.amazonaws.com:27017,ec2-54-165-239-161.compute-1.amazonaws.com:27017,ec2-54-165-41-0.compute-1.amazonaws.com:270..." } code:17280 numYields:0  0ms
2014-12-12T21:17:37.670+0000 I QUERY    [conn11] command config.$cmd command: insert { insert: "shards", documents: [ { _id: "ShardBigRepl1", host: "bigRepl1/ec2-54-165-103-121.compute-1.amazonaws.com:27017,ec2-54-165-239-161.compute-1.amazonaws.com:27017,ec2-54-165-41-0.compute-1.amazonaws.com:270..." } ] } ntoreturn:1 keyUpdates:0 numYields:0  reslen:351 0ms

mongos log:

2014-12-12T21:17:37.667+0000 I SHARDING [conn1] going to add shard: { _id: "ShardBigRepl1", host: "bigRepl1/ec2-54-165-103-121.compute-1.amazonaws.com:27017,ec2-54-165-239-161.compute-1.amazonaws.com:27017,ec2-54-165-41-0.compute-1.amazonaws.com:270..." }
2014-12-12T21:17:37.670+0000 I SHARDING [conn1] error adding shard: { _id: "ShardBigRepl1", host: "bigRepl1/ec2-54-165-103-121.compute-1.amazonaws.com:27017,ec2-54-165-239-161.compute-1.amazonaws.com:27017,ec2-54-165-41-0.compute-1.amazonaws.com:270..." } err: Btree::insert: key too large to index, failing config.shards.$host_1 2163 { : "bigRepl1/ec2-54-165-103-121.compute-1.amazonaws.com:27017,ec2-54-165-239-161.compute-1.amazonaws.com:27017,ec2-54-165-41-0.compute-1.amazonaws.com:270..." }
2014-12-12T21:17:37.670+0000 I COMMAND  [conn1] addshard request { addShard: "bigRepl1/ec2-54-175-35-158.compute-1.amazonaws.com:27017", name: "ShardBigRepl1" } failed: Btree::insert: key too large to index, failing config.shards.$host_1 2163 { : "bigRepl1/ec2-54-165-103-121.compute-1.amazonaws.com:27017,ec2-54-165-239-161.compute-1.amazonaws.com:27017,ec2-54-165-41-0.compute-1.amazonaws.com:270..." }

Interestingly, after the shard fails to add, the ReplicaSetMonitorWatcher thread still polls that replica set:

2014-12-12T21:17:47.652+0000 D NETWORK  [ReplicaSetMonitorWatcher] checking replica set: bigRepl1
2014-12-12T21:17:47.652+0000 D NETWORK  [ReplicaSetMonitorWatcher] creating new connection to:ec2-54-175-35-190.compute-1.amazonaws.com:27017
2014-12-12T21:17:47.653+0000 D COMMAND  [ConnectBG] BackgroundJob starting: ConnectBG
2014-12-12T21:17:47.655+0000 D NETWORK  [ReplicaSetMonitorWatcher] connected to server ec2-54-175-35-190.compute-1.amazonaws.com:27017 (10.93.20.167)
2014-12-12T21:17:47.655+0000 D NETWORK  [ReplicaSetMonitorWatcher] connected connection!
2014-12-12T21:17:47.655+0000 D SHARDING [ReplicaSetMonitorWatcher] checking wire version of new connection ec2-54-175-35-190.compute-1.amazonaws.com:27017 (10.93.20.167)
2014-12-12T21:17:47.658+0000 D NETWORK  [ReplicaSetMonitorWatcher] Updating host ec2-54-175-35-190.compute-1.amazonaws.com:27017 based on ismaster reply: { setName: "bigRepl1", setVersion: 1, ismaster: false, secondary: true, hosts: [ "ec2-54-175-35-158.compute-1.amazonaws.com:27017", "ec2-54-175-29-82.compute-1.amazonaws.com:27017", "ec2-54-175-24-203.compute-1.amazonaws.com:27017", "ec2-54-175-35-179.compute-1.amazonaws.com:27017", "ec2-54-165-41-0.compute-1.amazonaws.com:27017", "ec2-54-175-32-37.compute-1.amazonaws.com:27017", "ec2-54-175-34-65.compute-1.amazonaws.com:27017", "ec2-54-175-22-74.compute-1.amazonaws.com:27017", "ec2-54-175-24-188.compute-1.amazonaws.com:27017", "ec2-54-175-35-232.compute-1.amazonaws.com:27017", "ec2-54-175-35-98.compute-1.amazonaws.com:27017", "ec2-54-175-33-79.compute-1.amazonaws.com:27017", "ec2-54-175-27-176.compute-1.amazonaws.com:27017", "ec2-54-175-32-80.compute-1.amazonaws.com:27017", "ec2-54-175-38-136.compute-1.amazonaws.com:27017", "ec2-54-175-41-114.compute-1.amazonaws.com:27017", "ec2-54-175-32-125.compute-1.amazonaws.com:27017", "ec2-54-165-239-161.compute-1.amazonaws.com:27017", "ec2-54-175-41-209.compute-1.amazonaws.com:27017", "ec2-54-172-133-172.compute-1.amazonaws.com:27017", "ec2-54-165-103-121.compute-1.amazonaws.com:27017", "ec2-54-175-35-184.compute-1.amazonaws.com:27017", "ec2-54-175-41-193.compute-1.amazonaws.com:27017", "ec2-54-175-38-134.compute-1.amazonaws.com:27017", "ec2-54-175-29-84.compute-1.amazonaws.com:27017", "ec2-54-175-29-228.compute-1.amazonaws.com:27017", "ec2-54-175-33-122.compute-1.amazonaws.com:27017", "ec2-54-175-35-107.compute-1.amazonaws.com:27017", "ec2-54-175-29-83.compute-1.amazonaws.com:27017", "ec2-54-175-32-139.compute-1.amazonaws.com:27017", "ec2-54-175-28-154.compute-1.amazonaws.com:27017", "ec2-54-175-26-165.compute-1.amazonaws.com:27017", "ec2-54-175-24-4.compute-1.amazonaws.com:27017", "ec2-54-175-24-9.compute-1.amazonaws.com:27017", 
"ec2-54-175-25-247.compute-1.amazonaws.com:27017", "ec2-54-175-33-71.compute-1.amazonaws.com:27017", "ec2-54-175-39-18.compute-1.amazonaws.com:27017", "ec2-54-175-35-190.compute-1.amazonaws.com:27017", "ec2-54-175-35-100.compute-1.amazonaws.com:27017", "ec2-54-175-38-75.compute-1.amazonaws.com:27017", "ec2-54-174-77-58.compute-1.amazonaws.com:27017", "ec2-54-175-28-22.compute-1.amazonaws.com:27017", "ec2-54-175-36-50.compute-1.amazonaws.com:27017", "ec2-54-175-35-147.compute-1.amazonaws.com:27017", "ec2-54-175-35-141.compute-1.amazonaws.com:27017" ], primary: "ec2-54-175-35-158.compute-1.amazonaws.com:27017", me: "ec2-54-175-35-190.compute-1.amazonaws.com:27017", rbid: 1475283265, maxBsonObjectSize: 16777216, maxMessageSizeBytes: 48000000, maxWriteBatchSize: 1000, localTime: new Date(1418419067661), maxWireVersion: 3, minWireVersion: 0, ok: 1.0 }



 Comments   
Comment by Mira Carey [ 21/Jun/19 ]

We've closed this as Won't Fix for two reasons:

  1. I'm unsure whether this is still a problem. The original problem had to do with limits on indexed value size, and we no longer have those limits.
  2. It's been 5 years without a fix, and we've been able to avoid addressing this.

If you still encounter this issue, feel free to file a new ticket.

Comment by Andy Schwerin [ 06/Oct/15 ]

I believe that the index on the host field was put in as a way to avoid adding the same host as multiple shards. However, this is easily circumvented with CNAMEs or, in the case of replica sets, by rearranging the order of the seed list, since seed lists are not canonical. I think it would be safe and advisable to drop this index, and if we want to avoid double-adding the same replica set as multiple shards, we should address that problem more comprehensively.
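
As a rough illustration of the seed-list point, the sketch below shows two host strings that name the same replica set but differ as raw strings, and one possible way to compare them in an order-independent form. `canonical_shard_host` is a hypothetical helper, not something the server provides, and the example.com hostnames are invented. It does not address the CNAME case, which would need name resolution or a set-level identity check.

```python
def canonical_shard_host(conn_str):
    """Return an order-independent form of 'setName/host1,host2,...'."""
    if "/" not in conn_str:
        # Standalone host: only normalize case.
        return conn_str.strip().lower()
    set_name, _, host_part = conn_str.partition("/")
    # Hostnames are case-insensitive; sorting removes seed-list ordering differences.
    hosts = sorted({h.strip().lower() for h in host_part.split(",") if h.strip()})
    return set_name + "/" + ",".join(hosts)


# Hypothetical seed lists for the same replica set, in different orders.
a = "rs0/a.example.com:27017,b.example.com:27017"
b = "rs0/b.example.com:27017,a.example.com:27017"

# The raw strings differ, so a uniqueness check on them treats these as two shards.
print("raw strings equal:", a == b)
# The canonical forms match, so the duplicate is detectable.
print("canonical forms equal:", canonical_shard_host(a) == canonical_shard_host(b))
```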

Generated at Thu Feb 08 03:41:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.