[SERVER-15854] Improve logs when calling removeShard on a removed shard Created: 29/Oct/14  Updated: 18/Sep/15  Resolved: 13/Mar/15

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.6.4
Fix Version/s: 3.1.0

Type: Improvement Priority: Minor - P4
Reporter: Victor Hooi Assignee: Daniel Alabi
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Backwards Compatibility: Fully Compatible
Participants:

 Description   

The procedure for removing a shard from a sharded cluster is to run the removeShard command twice.

The first invocation will start draining the shard.

Once that is complete, a second invocation will then remove the shard.

However, no progress is reported for when the drain is complete. As a result, you may have to run the removeShard command several times to see whether it starts the removal or not.
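As a workaround for the missing progress reporting, the repeated invocations can be scripted. The following is a minimal sketch, not part of the server: drainAndRemoveShard, runCommand, sleepFn, log, maxAttempts and intervalMs are hypothetical names introduced here for illustration. It assumes the removeShard response carries a state field that reads "started", "ongoing" or "completed", plus a remaining sub-document while draining is ongoing.

```javascript
// Sketch: re-run removeShard until the response reports state "completed".
// runCommand, sleepFn and log are injected (hypothetical parameters) so the
// helper can be exercised without a live cluster.
function drainAndRemoveShard(runCommand, shardName, opts) {
  opts = opts || {};
  var maxAttempts = opts.maxAttempts || 60;      // give up after this many calls
  var intervalMs = opts.intervalMs || 10000;     // pause between calls
  var sleepFn = opts.sleepFn || function () {};  // no-op unless a sleep is supplied
  var log = opts.log || function () {};          // silent unless a logger is supplied

  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    var res = runCommand({ removeShard: shardName });

    // A failed command (for example, the shard no longer exists) ends the loop.
    if (!res.ok) {
      return { done: false, attempts: attempt, error: res.errmsg };
    }

    // "completed" means the shard has been removed from the cluster.
    if (res.state === "completed") {
      return { done: true, attempts: attempt };
    }

    // Otherwise draining has started or is still ongoing: report and wait.
    var chunksLeft = res.remaining ? res.remaining.chunks : "unknown";
    log("removeShard state: " + res.state + ", chunks remaining: " + chunksLeft);
    sleepFn(intervalMs);
  }

  return { done: false, attempts: maxAttempts, error: "drain did not complete in time" };
}
```

In the mongo shell this could be called as drainAndRemoveShard(function (cmd) { return db.adminCommand(cmd); }, "foobar", { sleepFn: sleep, log: print }). Passing the command runner in as a function keeps the polling logic separate from the connection.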

Also, if you happen to run the removeShard command after the removal is complete, it prints an assertion message:

mongos> db.runCommand({ removeShard: "foobar" })
{ "code" : 13129, "ok" : 0, "errmsg" : "exception: can't find shard for: foobar" }

as well as a stack trace:

2014-09-19T19:08:19.071+0000 [conn139] Assertion: 13129:can't find shard for: foobar
2014-09-19T19:08:19.107+0000 [conn139] 0xdbc1f1 0xd66309 0xd49536 0xd49a8c 0xc62b78 0xc5a354 0xb764be 0xc5353a 0xb8c9dc 0xc7cae1 0xc528ff 0x6e6f29 0xd714ce 0x39d30079d1 0x39d2ce8b5d
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo15printStackTraceERSo+0x21) [0xdbc1f1]
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo10logContextEPKc+0x159) [0xd66309]
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo11msgassertedEiPKc+0xe6) [0xd49536]
 /opt/mongod27017/mongod/bin/mongos() [0xd49a8c]
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo15StaticShardInfo4findERKSs+0x358) [0xc62b78]
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo5Shard5resetERKSs+0x34) [0xc5a354]
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo11dbgrid_cmds14RemoveShardCmd3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x16e) [0xb764be]
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo7Command22execCommandClientBasicEPS0_RNS_11ClientBasicEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x2ba) [0xc5353a]
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo7Command20runAgainstRegisteredEPKcRNS_7BSONObjERNS_14BSONObjBuilderEi+0x32c) [0xb8c9dc]
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo8Strategy15clientCommandOpERNS_7RequestE+0x561) [0xc7cae1]
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo7Request7processEi+0x6ef) [0xc528ff]
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x69) [0x6e6f29]
 /opt/mongod27017/mongod/bin/mongos(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x4ee) [0xd714ce]
 /lib64/libpthread.so.0() [0x39d30079d1]
 /lib64/libc.so.6(clone+0x6d) [0x39d2ce8b5d]

Could we please handle this condition more gracefully, either by making the assertion message a bit friendlier and/or by not dumping out the stack trace like that?



 Comments   
Comment by Githook User [ 13/Mar/15 ]

Author: Daniel Alabi (alabid) <alabidan@gmail.com>

Message: SERVER-15854 Use ShardNotFound error code for deleted shard
Branch: master
https://github.com/mongodb/mongo/commit/8b1c4e3824fb63a70f9dbb9522176bbf4a4913df

Comment by Githook User [ 11/Mar/15 ]

Author: Daniel Alabi (alabid) <alabidan@gmail.com>

Message: SERVER-15854 Don't massert when checking for deleted shard
Branch: master
https://github.com/mongodb/mongo/commit/ceb58fee5495ce2b45f6aaba7aa3da76b9cdf8d9

Comment by Andy Schwerin [ 10/Mar/15 ]

alabid, please remove the stack trace. You might be able to replace the use of Shard::make with Shard::findIfExists. If the shard doesn't exist, you can log a more informative message and also return the more informative message in the command message. Talk to renctan about whether that approach is sound.
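
The approach suggested in this comment can be modelled roughly as follows. This is a JavaScript sketch of the pattern only, not the server's C++ code: findShardIfExists, removeShardCommand, the registry object and the response shape (including the codeName field) are hypothetical names used here for illustration. The idea is that a lookup returning null, rather than asserting, lets the command log one informative line and return the same message to the caller without a stack trace.

```javascript
// Hypothetical in-memory stand-in for the shard registry: name -> shard info.
// Returns null for an unknown shard instead of throwing.
function findShardIfExists(registry, name) {
  return Object.prototype.hasOwnProperty.call(registry, name) ? registry[name] : null;
}

// Sketch of a removeShard handler built on the non-throwing lookup.
function removeShardCommand(registry, name, log) {
  var shard = findShardIfExists(registry, name);

  if (shard === null) {
    // One informative log line, and the same text in the command response.
    var msg = "Shard '" + name + "' does not exist; it may already have been removed";
    log(msg);
    return { ok: 0, codeName: "ShardNotFound", errmsg: msg };
  }

  // The shard exists: a real implementation would begin or continue draining here.
  return { ok: 1, state: "started", shard: name };
}
```

With this shape, calling the command for an already removed shard produces an ordinary error document, and the assertion path is never entered.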
