[SERVER-3683] Possible for setShardVersion to never be set on mongod after multiple StaleConfigExceptions due to stale/missing mongod metadata Created: 24/Aug/11  Updated: 11/Jul/16  Resolved: 16/Sep/11

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 1.8.3
Fix Version/s: 1.8.4, 2.0.0-rc2

Type: Bug
Priority: Major - P3
Reporter: Greg Studer
Assignee: Eliot Horowitz (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-3889 Possible for setShardVersion to never... Closed
is related to SERVER-4118 mongos causes dos by opening a ton of... Closed
Operating System: ALL
Participants:

 Description   

...possibly also affects 1.9/2.0.

The core issue is that on a StaleConfigException from a query (handled in the mongos at s/request.cpp), the steps that update the cached shard information from the config server in 1.8.3 no longer always reload the ChunkManager for collections that have not changed. The assumption seems to be that the mongod is more up-to-date than the mongos, so we should not need to call setShardVersion on the mongod unless the mongos config information (ChunkManager) changes (in 1.8.2 it always changes on reload). If the mongod sharding metadata is somehow less up-to-date than the mongos, the query will be retried repeatedly until it fails.

A possible fix is to force a reload of the chunk manager after the second retry, in order to handle this case.
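
To illustrate the failure mode and the proposed fix, here is a minimal Python simulation. It is a sketch, not the mongos code: FakeMongod, FakeMongos, refresh, run_query, max_attempts and force_after are all hypothetical names, and the exact retry threshold is an assumption. The shard starts with no version set; the mongos cache already matches the config server, so an unforced refresh never re-sends setShardVersion.

```python
class StaleConfigException(Exception):
    """Raised by the simulated mongod when no shard version is set."""


class FakeMongod:
    """Hypothetical stand-in for a shard with missing sharding metadata."""

    def __init__(self):
        self.shard_version = None  # stale/missing metadata

    def set_shard_version(self, version):
        self.shard_version = version

    def query(self):
        if self.shard_version is None:
            raise StaleConfigException("no version set for this collection")
        return "ok"


class FakeMongos:
    """Hypothetical stand-in for the mongos retry path."""

    def __init__(self, shard, config_version=1):
        self.shard = shard
        self.config_version = config_version
        self.cached_version = config_version  # cached ChunkManager is current
        self.set_shard_version_calls = 0

    def refresh(self, force):
        # Without force, an unchanged collection is not reloaded, so
        # setShardVersion is never re-sent to the shard.
        changed = self.cached_version != self.config_version
        if force or changed:
            self.cached_version = self.config_version
            self.shard.set_shard_version(self.cached_version)
            self.set_shard_version_calls += 1

    def run_query(self, max_attempts=5, force_after=2):
        # attempt 0 is the initial try; attempt n is the n-th retry.
        for attempt in range(max_attempts):
            try:
                return self.shard.query(), attempt
            except StaleConfigException:
                force = force_after is not None and attempt >= force_after
                self.refresh(force=force)
        raise StaleConfigException("retries exhausted")
```

With the default force_after=2, the reload is forced once the second retry fails and the next attempt succeeds. With force_after=None the simulation reproduces the reported behavior: every attempt fails and the query eventually errors out.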

Not sure at the moment how this state could come about.



 Comments   
Comment by Greg Studer [ 12/Dec/11 ]

see SERVER-4118 for 2.0.1 issues - this ticket is related but has been fixed in 2.0.1.

Comment by Kiril Savino [ 23/Nov/11 ]

Also seeing a ton of this in the logs, in 2.0.1, after a relatively recent failover, on the primary node.

Comment by Zeph Wang [ 01/Nov/11 ]

I'm seeing a lot of these messages in 2.0.1 mongod/mongos logs. Are they related to this issue?
"shard version not ok in Client::Context: client in sharded mode, but doesn't have version set for this collection"

Comment by auto [ 06/Sep/11 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: when running checkShardVersion, need to make sure we do on actual connection, not replica set connection SERVER-3683

Conflicts:

s/shard_version.cpp
Branch: v1.8
https://github.com/mongodb/mongo/commit/09afe30a6081d802141e671f43ca330eccd3528c

Comment by auto [ 04/Sep/11 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: test for SERVER-3683
Branch: master
https://github.com/mongodb/mongo/commit/68c94172053cb15a3692d1f5f933fcfd67bd8add

Comment by auto [ 04/Sep/11 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: when running checkShardVersion, need to make sure we do on actual connection, not replica set connection SERVER-3683
Branch: master
https://github.com/mongodb/mongo/commit/8d72a36f12c23f6ed7754bd825262e70a6bd426c
