Type: Bug
Resolution: Done
Priority: Major - P3
Affects Version/s: 1.7.1
Component/s: None
ALL
Problem:
After dropping a shard and re-creating a shard with the same name, the following was seen in the logs:
Sun Oct 10 20:37:43 [conn173932] DBException in process: setShardVersion failed!
{ "errmsg" : "exception: gotShardHost different than what i had before before [set3/rs3a:27018] got [set3/rs3a:27018,rs3b:27018] ", "code" : 13299, "ok" : 0 }
Reproduce:
- turn the balancer off
db.settings.update( { _id : "balancer" }, { $set : { stopped : true } }, true )
- create a 2 member replset, "foo"
- add the shard with a single member
db.runCommand( { addshard : "foo/node1", maxSize: 409600, name : "shard1" });
- remove the shard
db.runCommand( { removeshard : "foo/node1" });
- add the shard again, but with both nodes
db.runCommand( { addshard : "foo/node1,node2", maxSize: 409600, name : "shard1" });
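The two host strings in the error message refer to the same replica set and differ only in their member lists. As a rough illustration of that difference, and not a description of how mongos or mongod actually compares them, the sketch below splits a "setName/host1,host2" string and checks whether two strings name the same set. The function names `parseShardHost` and `sameReplicaSet` are hypothetical.

```javascript
// Illustration only: split a "setName/host1,host2" shard host string.
// A string without a "/" is treated as a plain host list with no set name.
function parseShardHost(s) {
  var slash = s.indexOf("/");
  if (slash === -1) {
    return { setName: null, hosts: s.split(",") };
  }
  return {
    setName: s.substring(0, slash),
    hosts: s.substring(slash + 1).split(",")
  };
}

// Hypothetical check: true when both strings name the same replica set,
// even if the listed members differ.
function sameReplicaSet(a, b) {
  var pa = parseShardHost(a);
  var pb = parseShardHost(b);
  return pa.setName !== null && pa.setName === pb.setName;
}

// The two values taken from the error message in this report.
var before = "set3/rs3a:27018";
var got = "set3/rs3a:27018,rs3b:27018";
```

With these values, a plain string comparison of `before` and `got` reports a mismatch, while `sameReplicaSet(before, got)` treats them as the same shard.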
Workaround:
Since the members of the shard were part of a replset, the following was performed to clear the error:
- find the current master (by looking at rs.status())
- on the current master, run rs.stepDown()
- restart that mongod process
- repeat until all members of the replset have been restarted
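The first step of the workaround can be sketched as a small helper that picks the current master out of an rs.status()-style document. This is a minimal sketch, assuming each entry in `members` has a `name` and a numeric `state`, with state 1 meaning PRIMARY. The helper name `findPrimary` and the sample values are made up for illustration.

```javascript
// Hypothetical helper: return the name of the current master from an
// rs.status()-style document, or null if no member is in state 1 (PRIMARY).
function findPrimary(status) {
  var members = status.members || [];
  for (var i = 0; i < members.length; i++) {
    if (members[i].state === 1) {
      return members[i].name;
    }
  }
  return null; // no master currently elected
}

// Sample document shaped like rs.status() output (values are made up).
var sampleStatus = {
  set: "foo",
  members: [
    { name: "node1:27018", state: 2, stateStr: "SECONDARY" },
    { name: "node2:27018", state: 1, stateStr: "PRIMARY" }
  ]
};
```

In the mongo shell, the same helper could be called as `findPrimary(rs.status())` before connecting to that member and running rs.stepDown().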
Business Case:
- Reliability
Stale metadata needs to be handled more gracefully and automatically.