[SERVER-2879] Upgrade 1.6 to 1.8 changes shard member format requirements, cryptic error Created: 31/Mar/11  Updated: 12/Jul/16  Resolved: 04/May/11

Status: Closed
Project: Core Server
Component/s: Replication, Sharding
Affects Version/s: 1.8.0
Fix Version/s: 1.9.1

Type: Bug Priority: Major - P3
Reporter: James Pharaoh Assignee: Eliot Horowitz (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

When upgrading from 1.6.4 to 1.8.0, which is supposed to be a drop-in replacement, mongos isn't connecting to members correctly. Various errors are seen, both in the logs and in the command line client, pointing to line 231 in shard.cpp.

This turns out to be because the shards are set up in the config database without a replica set name before the list of hosts in each replica set. I had to change this:

{ "_id" : "27019", "host" : "gold.private:27019,silver.private:27019" }
{ "_id" : "27020", "host" : "tin.private:27020,lead.private:27020" }

to this:

{ "_id" : "27019", "host" : "27019/gold.private:27019,silver.private:27019" }
{ "_id" : "27020", "host" : "27020/tin.private:27020,lead.private:27020" }
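
The change above amounts to prefixing each host list with a replica set name and a slash. A minimal Python sketch of that transformation on plain dictionaries follows. The function name with_replica_set_name is hypothetical, it does not touch a database, and falling back to the shard _id as the set name is an assumption that happens to match the configuration in this report; in general the prefix has to be the real replica set name.

```python
def with_replica_set_name(shard_doc, set_name=None):
    """Return a copy of a shard document whose host string is 'setName/host1,host2'.

    Hypothetical helper for illustration only. If set_name is not given,
    the shard _id is used, which is an assumption that matches this report.
    """
    host = shard_doc["host"]
    if "/" in host:
        # Already in the prefixed form; return an unchanged copy.
        return dict(shard_doc)
    name = set_name if set_name is not None else shard_doc["_id"]
    updated = dict(shard_doc)
    updated["host"] = f"{name}/{host}"
    return updated


old_docs = [
    {"_id": "27019", "host": "gold.private:27019,silver.private:27019"},
    {"_id": "27020", "host": "tin.private:27020,lead.private:27020"},
]
new_docs = [with_replica_set_name(doc) for doc in old_docs]
```

The input documents are left unmodified, so the old and new forms can be compared before anything is written back to the config database.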

Firstly, the error message I got for this is extremely lacking. If there is something wrong with the string here then it should be pretty easy to tell the user that when the server starts up.

Secondly, why has the behaviour changed? It seems like the id could easily be used to name the shard in this case. Where has this new requirement come from? It seems a bit dubious, since it breaks the upgrade from 1.6 to 1.8, which is supposed to be "drop in".

Mailing list thread: http://groups.google.com/group/mongodb-user/browse_thread/thread/8a5562671b3b3bd9



 Comments   
Comment by auto [ 04/May/11 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: fix error message for invalid host format SERVER-2879
Branch: master
https://github.com/mongodb/mongo/commit/3741503bec716a54f8fe3386096628d1accbbf7b

Comment by Eliot Horowitz (Inactive) [ 31/Mar/11 ]

The reason for the change is sanity checking: we need to know which replica set name you are expecting, so that we can check it and make sure we're talking to the right servers.
This is all to prevent logical corruption.
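
The kind of sanity check described here can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in shard.cpp: the function names are hypothetical, the error wording is invented, and the sketch only covers shards backed by a replica set, where the host string is expected to look like setName/host1:port,host2:port.

```python
def parse_shard_host(host):
    """Split 'setName/host1:port,host2:port' into (set_name, [members]).

    Hypothetical helper; raises ValueError with an explicit message
    when the replica set name or the member list is missing.
    """
    set_name, sep, members = host.partition("/")
    if not sep or not set_name or not members:
        raise ValueError(
            f"invalid shard host {host!r}: expected 'setName/host1:port,host2:port'"
        )
    return set_name, members.split(",")


def check_set_name(host, reported_set_name):
    """Compare the expected set name with the one a server reports.

    reported_set_name stands in for whatever name the contacted server
    returns; how that value is obtained is outside this sketch.
    """
    expected, _ = parse_shard_host(host)
    if expected != reported_set_name:
        raise ValueError(
            f"expected replica set {expected!r} but server reports {reported_set_name!r}"
        )
```

With the old format from this report, parse_shard_host("gold.private:27019,silver.private:27019") raises immediately with a message naming the expected format, which is the sort of startup error the reporter asked for.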
