[SERVER-4262] when dropping collections need to invalidate all conn sharding state
Created: 11/Nov/11  Updated: 11/Jul/16  Resolved: 12/Jun/12

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 2.1.2

Type: Bug
Priority: Major - P3
Reporter: Greg Studer
Assignee: Greg Studer
Resolution: Done
Votes: 7
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: multi_coll_drop.js
Issue Links:
Depends
depends on SERVER-5918 invalid memory writes in sharding tes... Closed
is depended on by SERVER-5150 drop collection : now errors with sha... Closed
Duplicate
is duplicated by SERVER-5753 Assertion failures when a sharded col... Closed
Related
is related to SERVER-4429 When moving primary need to inform al... Closed
Operating System: ALL

 Description   

When dropping sharded collections on mongod, we need to invalidate all connection sharding state for that namespace. Otherwise, if the collection gets recreated, the cached information will be stale.



 Comments   
Comment by auto [ 15/Jun/12 ]

Author:

{u'date': u'2012-06-15T13:17:50-07:00', u'email': u'greg@10gen.com', u'name': u'Greg Studer'}

Message: SERVER-4262 more refactoring from review - method names for removeDB/IfExists and comments
Branch: master
https://github.com/mongodb/mongo/commit/b3c3f37dba8102222462cbd3f7406efdf32f54d9

Comment by auto [ 15/Jun/12 ]

Author:

{u'date': u'2012-06-15T13:06:26-07:00', u'email': u'greg@10gen.com', u'name': u'Greg Studer'}

Message: SERVER-4262 refactor error handling of strategy_shard, integrate comments from review
Branch: master
https://github.com/mongodb/mongo/commit/bf3176051959030ec5785385f007eaee970b08c2

Comment by Greg Studer [ 12/Jun/12 ]

This is only fixed once all portions of the cluster are fully upgraded to 2.1.2/2.2. The behavior of writes issued while the drop is in progress is undefined: they may succeed and be added to a new collection with the same name afterward, or they may not.

Comment by auto [ 11/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 fix $atomic for sharded ops - now works if sent to single shard
Branch: master
https://github.com/mongodb/mongo/commit/88100232912f0b9af6a01f454072ed723f3c830e

Comment by auto [ 11/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 sharded indexing depends on the db sharding status

This reverts commit f1852efa8e4e5d702c472c35fb45d0a29a90d7d3.
Branch: master
https://github.com/mongodb/mongo/commit/66eb52c108d9d91fed9e2bfffe52b28e4418aea3

Comment by auto [ 11/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 ref not value

This reverts commit 2deb6a2f1b457adc58436f11dd730d1fcc525084.
Branch: master
https://github.com/mongodb/mongo/commit/c11ff4259ba77a037afb075786e3f3fec92c5a33

Comment by auto [ 11/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 test that invalid writes are detected even when mongos initially stale

This reverts commit ccd568b6e292b4641f60c704ddf983240e8f23a4.
Branch: master
https://github.com/mongodb/mongo/commit/5283419448b2251d20a6e0e084c5fb70c4b6bd49

Comment by auto [ 11/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 and SERVER-4732 make wbl less aggressive and more tolerant of dropping sharded collections

This reverts commit 38a258f13283440eb3f19afbb653d872b04d0d98.
Branch: master
https://github.com/mongodb/mongo/commit/ec289d975679f18661f191776b1fca8d0bf4b14d

Comment by auto [ 11/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 remove race condition on config reload in sharded remove

This reverts commit 2e12c38432817131bfa463e763198436ac82d115.
Branch: master
https://github.com/mongodb/mongo/commit/74d284ac1cbd4ce36d0396d7c6ac18821aba1fa2

Comment by auto [ 11/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-5200 and SERVER-4262 remove race condition with simultaneous config reload from update

This reverts commit e123c445170548c0eb248dc1b221dfb09aa9b1ec.
Branch: master
https://github.com/mongodb/mongo/commit/e8f0424f33455017086b2c986fdb572c4cad2dbd

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: Revert "SERVER-5200 and SERVER-4262 remove race condition with simultaneous config reload from update"

This reverts commit 9494663f0ab571215dba96f10cdca78a1a36cddc.
Branch: master
https://github.com/mongodb/mongo/commit/e123c445170548c0eb248dc1b221dfb09aa9b1ec

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: Revert "SERVER-4262 remove race condition on config reload in sharded remove"

This reverts commit d7f89643a917538fa953a17d80acc164fb4885ad.
Branch: master
https://github.com/mongodb/mongo/commit/2e12c38432817131bfa463e763198436ac82d115

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: Revert "SERVER-4262 and SERVER-4732 make wbl less aggressive and more tolerant of dropping sharded collections"

This reverts commit bc4475ebcb2548e21fadf44549b73f61f0c577ce.
Branch: master
https://github.com/mongodb/mongo/commit/38a258f13283440eb3f19afbb653d872b04d0d98

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: Revert "SERVER-4262 test that invalid writes are detected even when mongos initially stale"

This reverts commit a65d2bc01f263ccefc1a7392f48b362338c83daf.
Branch: master
https://github.com/mongodb/mongo/commit/ccd568b6e292b4641f60c704ddf983240e8f23a4

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: Revert "SERVER-4262 ref not value"

This reverts commit 4c881e3c93ee29fa4ca22d9da7b7a45150448455.
Branch: master
https://github.com/mongodb/mongo/commit/2deb6a2f1b457adc58436f11dd730d1fcc525084

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: Revert "SERVER-4262 sharded indexing depends on the db sharding status"

This reverts commit 13aa0a0276eb239454da00d65f8f15aa1a97bb10.
Branch: master
https://github.com/mongodb/mongo/commit/f1852efa8e4e5d702c472c35fb45d0a29a90d7d3

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: SERVER-4262 sharded indexing depends on the db sharding status
Branch: master
https://github.com/mongodb/mongo/commit/13aa0a0276eb239454da00d65f8f15aa1a97bb10

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: SERVER-4262 ref not value
Branch: master
https://github.com/mongodb/mongo/commit/4c881e3c93ee29fa4ca22d9da7b7a45150448455

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: SERVER-4262 test that invalid writes are detected even when mongos initially stale

also requires a fix for the WBL to report errors more nicely
Branch: master
https://github.com/mongodb/mongo/commit/a65d2bc01f263ccefc1a7392f48b362338c83daf

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 and SERVER-4732 make wbl less aggressive and more tolerant of dropping sharded collections

also better track and message how many writebacks are being processed
at each version
Branch: master
https://github.com/mongodb/mongo/commit/bc4475ebcb2548e21fadf44549b73f61f0c577ce

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 remove race condition on config reload in sharded remove

again, also refactor option parameters
Branch: master
https://github.com/mongodb/mongo/commit/d7f89643a917538fa953a17d80acc164fb4885ad

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-5200 and SERVER-4262 remove race condition with simultaneous config reload from update

also refactor to make option-passing cleaner
Branch: master
https://github.com/mongodb/mongo/commit/9494663f0ab571215dba96f10cdca78a1a36cddc

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 refactor mongos insert logic, make more modular, normalize options
Branch: master
https://github.com/mongodb/mongo/commit/5390d21c7d997ba2447d8a0a58e95d208b5d4218

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 refactor and cleanly catch exception when getMore cursor isn't found

Otherwise propagates as a "HostAndPort host empty" error, which is very misleading.
Branch: master
https://github.com/mongodb/mongo/commit/96b1b65dc66966fe8f17a30c5bde6036e2c8588d

Comment by auto [ 09/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 make dropping sharded databases and collections safer for multiple processes

In particular, a database drop may abort at any collection which has a lock taken, and so
we want retries because of this to do the right thing.
Branch: master
https://github.com/mongodb/mongo/commit/a0a9c81ca415f9c9b3c5e7bc2379a147d85e67dc
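
A rough sketch of the retry behavior the commit message above describes, not the actual mongos/mongod code. It assumes an in-memory dict standing in for the set of collections and a set of locked namespaces; `LockHeld`, `drop_database`, and `drop_database_with_retry` are hypothetical names used only for illustration. The point is that each retry recomputes what is left to drop, so collections removed by an earlier, aborted attempt are skipped rather than treated as errors.

```python
class LockHeld(Exception):
    """Raised when another process holds the lock on a collection (hypothetical)."""


def drop_database(catalog: dict, locked: set, db: str) -> None:
    """Drop every collection of `db`, aborting at the first locked one."""
    # The list is materialized before deleting, so mutation during the loop is safe.
    for ns in sorted(n for n in catalog if n.startswith(db + ".")):
        if ns in locked:
            raise LockHeld(ns)
        del catalog[ns]


def drop_database_with_retry(catalog, locked, db, attempts=3, on_retry=None) -> bool:
    """Retry an aborted drop; collections already dropped are simply not revisited."""
    for _ in range(attempts):
        try:
            drop_database(catalog, locked, db)
            return True
        except LockHeld as exc:
            # Give the caller a chance to wait or release the lock before retrying.
            if on_retry is not None:
                on_retry(str(exc))
    return False
```

In this toy model a drop that aborts partway leaves the remaining collections in place, and a later retry finishes the job without failing on the ones that are already gone.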

Comment by auto [ 08/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 error codes
Branch: master
https://github.com/mongodb/mongo/commit/7b491d422e691365c84e5cb6d13a230c1bf307f4

Comment by auto [ 08/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 better messaging when reloading full shard version of a database
Branch: master
https://github.com/mongodb/mongo/commit/717f72638db7686e59c05405673a0e783ff2e600

Comment by auto [ 08/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 normalize options for write operations and refactor into common location
Branch: master
https://github.com/mongodb/mongo/commit/529820fd434192470a91ef1cd257ee58bb0e22a5

Comment by auto [ 08/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 better commenting of coll_epoch_test1.js
Branch: master
https://github.com/mongodb/mongo/commit/e14233f0213fd446381bbccdbf9b9c63b485c1c1

Comment by auto [ 08/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 correct reload logic in sharded queries for epoch changes
Branch: master
https://github.com/mongodb/mongo/commit/72853a054d26271cc10249a84f94aa5c6697c34f

Comment by auto [ 08/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 fixes for regression in mongod setshardversion logic with epoch

need to check epoch when comparing non-zero, otherwise we'll never reload different epochs
Branch: master
https://github.com/mongodb/mongo/commit/d5dd0299041af8858b89199d019dfdcc7c4ca8eb
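
A minimal sketch of the kind of comparison the commit message above refers to, under stated assumptions. It is not the mongod implementation: `ShardVersion`, `is_set`, and `needs_reload` are hypothetical names, and the epoch is modeled as a plain string rather than an OID. It only illustrates why two non-zero versions with different epochs must trigger a reload even when the numeric part looks older.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardVersion:
    major: int
    minor: int
    epoch: str  # stand-in for the OID that identifies one incarnation of a collection

    def is_set(self) -> bool:
        """A version of 0|0 is treated as 'no version' in this sketch."""
        return self.major != 0 or self.minor != 0


def needs_reload(cached: ShardVersion, received: ShardVersion) -> bool:
    """Return True when the cached sharding state should be reloaded."""
    # Different epochs mean the collection was dropped and recreated, so the
    # numeric parts are not comparable and the cached state is stale.
    if cached.is_set() and received.is_set() and cached.epoch != received.epoch:
        return True
    # Same epoch: reload only if the received version is newer.
    return (received.major, received.minor) > (cached.major, cached.minor)
```

Without the epoch check, a recreated collection that starts again at a low version would compare as older than the cached one and would never be reloaded.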

Comment by auto [ 08/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 add future-compatible BSON parsing of shard versions, with unit test
Branch: master
https://github.com/mongodb/mongo/commit/b0a992a951ba20ff56782769306c36e6df292bed

Comment by auto [ 08/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: SERVER-4262 Correct detection of sharded collection dropping when performing various ops through mongos.

Insert has been updated to correctly detect unsharded collections without errors, update and remove still contain race conditions partially mitigated by the WBL fix here.

Also includes tests of this functionality.
Branch: master
https://github.com/mongodb/mongo/commit/c2895e67e383fe6af4355454cbe363a0bd8487a0

Comment by auto [ 08/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: SERVER-4262 Make mongod reject based on epoch
Branch: master
https://github.com/mongodb/mongo/commit/b5367c120d2bbf7432419bbf769628468c8d57fa

Comment by auto [ 08/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: SERVER-4262 Refactor config reloading to better handle refactored chunk manager
Branch: master
https://github.com/mongodb/mongo/commit/03e443e874e3e04de1f3ef87bc2aef629afbdaec

Comment by auto [ 08/Jun/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: SERVER-5942 and SERVER-4262 Detect changes to epoch when reloading chunks
Branch: master
https://github.com/mongodb/mongo/commit/827bd4a9aa483bec4a0b1f237a6423c3b2489135

Comment by auto [ 22/May/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: SERVER-4262 allow independent testing of ChunkManager
Branch: master
https://github.com/mongodb/mongo/commit/d1fd245ea8ec1a75974169fc0f07d21dc3217a96

Comment by auto [ 21/May/12 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: SERVER-4262 refactor chunk manager and ensure epochs are created correctly
Branch: master
https://github.com/mongodb/mongo/commit/4c1699f9bc6a10c82746b78ea86bb947e7033038

Comment by auto [ 21/May/12 ]

Author:

{u'login': u'gregstuder', u'name': u'Greg Studer', u'email': u'greg@10gen.com'}

Message: SERVER-4262 Changes to get OID into the ShardChunkVersion

Refactoring to allow better tracking of when we compare versions, since this needs to be changed.
Branch: master
https://github.com/mongodb/mongo/commit/76d99d60cad11017c00a832e2bc467f1c46fa5fc

Comment by Eliot Horowitz (Inactive) [ 17/Apr/12 ]

@Y Wayne: There are tests that test this functionality, but there were edge cases they didn't catch.
We're working on a full fix, but it's not yet finished.
Once it is, we will be able to determine whether it is safe to backport to 2.0.
This issue is being worked on right now by engineers, but we want to make sure the fix is complete and handles all possible issues for the foreseeable future.

Comment by Y. Wayne Huang [ 17/Apr/12 ]

all of these version-related issues and the fact that none of the fixes are backported to 2.0 really speaks volumes about priorities at 10gen. nobody has bothered to test very simple use cases that cause these problems before releasing a "stable" version. instead of properly fixing the issues and backporting to stable, we are told to implement ridiculous workarounds like restarting mongos or running experimental versions. seriously, wtf guys?

Comment by Greg Studer [ 17/Apr/12 ]

For these cases, another workaround that avoids version issues and restarting is to append OIDs to recycled namespaces, so that the namespaces aren't re-used but new namespaces are continually created.
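
A minimal sketch of this workaround, assuming the application controls collection naming. `fresh_namespace` is a hypothetical helper, and a random 12-byte hex string from the Python standard library stands in for a real ObjectId (it has the same 24-character hex length but is not generated by a driver).

```python
import secrets


def fresh_namespace(db: str, base: str) -> str:
    """Build a namespace that has never been used before.

    Instead of dropping and recreating "db.base", each cycle gets a new
    name, so no stale sharding state can exist for it.
    """
    suffix = secrets.token_hex(12)  # 12 random bytes -> 24 hex characters
    return f"{db}.{base}_{suffix}"
```

The application then has to track which suffixed name is current, and can drop the old namespaces at its leisure since they are never reused.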

Comment by Y. Wayne Huang [ 17/Apr/12 ]

it's unfortunate that this is not being backported to 2.0. restarting mongos hardly seems like a suitable workaround, particularly for workloads where collections are automatically created/removed periodically.

Comment by David Nesbitt [ 17/Apr/12 ]

Workaround from Kristina Chodorow:

"You could try running the flushRouterConfig command, but you will
probably actually have to restart the mongos to clear the state."

http://groups.google.com/group/mongodb-user/browse_thread/thread/29898a80b4aed670

Comment by auto [ 14/Nov/11 ]

Author:

{u'login': u'erh', u'name': u'Eliot Horowitz', u'email': u'eliot@10gen.com'}

Message: fix case where a sharded collection is dropped and we need to reset state SERVER-4262
Branch: master
https://github.com/mongodb/mongo/commit/aa079a1389ff589898194671babcb8307d1d3085

Comment by Eliot Horowitz (Inactive) [ 14/Nov/11 ]

The above commit fixes the test, but there is probably a better way to do this pre-emptively.
This patch still makes sense though; it is just not the most efficient approach.

Comment by Greg Studer [ 13/Nov/11 ]

Attached test case which reproduces the issue (on 2.0 at least).
