[SERVER-3393] Dropping a sharded collection and re-sharding it leads to inconsistent inserts Created: 07/Jul/11  Updated: 02/Sep/11  Resolved: 02/Sep/11

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 1.8.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Mike K Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu Natty Narwhal, EC2


Issue Links:
Depends
depends on SERVER-1726 drop/dropDatabase concurrency when sh... Closed
Related
is related to SERVER-3392 mongos never picked up that a collect... Closed
Operating System: ALL
Participants:

 Description   

Steps to reproduce (this has happened three times to us):

Given a cluster of two shards (let's call them S0 and S1),
And two application servers (let's call them A0 and A1), each running mongos locally,
And finally, one non-application server, T0, that also runs a mongos and is where we issue the admin commands,
And a collection 'comments'.

1. On T0, issue enablesharding and shardcollection (see the sketch after this list)
2. Run a script to pre-split chunks (splitChunk and moveChunk)
3. 'comments' is now pre-split evenly into 102 chunks, 51 on each shard
4. Start throwing live traffic from A0 and A1
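
A minimal sketch of steps 1-3, as run from the mongos on T0. The database name ("app"), the shard key ({ uid: 1 }), the split values, and the shard name "shard0001" are all assumptions; the ticket does not state them:

    // Enable sharding and shard the collection (1.8-era command names).
    db.adminCommand({ enablesharding: "app" });
    db.adminCommand({ shardcollection: "app.comments", key: { uid: 1 } });

    // 101 splits yield 102 chunks.
    for (var i = 1; i < 102; i++) {
        db.adminCommand({ split: "app.comments", middle: { uid: i * 1000 } });
    }

    // Move every other chunk off the primary shard so each shard holds 51.
    // "shard0001" stands in for whichever shard is not the database's primary.
    for (var i = 0; i < 102; i += 2) {
        db.adminCommand({
            moveChunk: "app.comments",
            find: { uid: i * 1000 + 1 },
            to: "shard0001"
        });
    }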

At this point, we noticed the pre-split wasn't quite correct and was sending all writes to S1 (our mistake), so we did the following:

1. Stopped throwing writes at the cluster
2. On T0, issued a collection .drop()
3. Issued another shardcollection with the same collection name, which re-created the collection (see the sketch after this list)
4. Ran the corrected pre-split
5. Threw live traffic from A0 and A1 again
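
A sketch of the drop/re-shard sequence (steps 2-4), using the same assumed names as the first sketch:

    // Drop via the mongos on T0, then re-shard under the same name.
    db.getSiblingDB("app").comments.drop();
    db.adminCommand({ shardcollection: "app.comments", key: { uid: 1 } });
    // ...then re-run the corrected split/moveChunk loop from the first sketch.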

The expected behavior was that the mongos on A0 and A1 would pick up the new chunk information; instead, all the inserts went to S1, as if those mongos processes never picked up the changes from the second pre-split (even though printShardingStatus() on A0's and A1's mongos looked okay).
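
Besides printShardingStatus(), two cross-checks of this behavior from a mongos on A0 or A1 (the namespace "app.comments" is again an assumption):

    // Chunk metadata as the mongos sees it: should report all 102 chunks.
    db.getSiblingDB("config").chunks.count({ ns: "app.comments" });

    // Per-shard document counts for a sharded collection; here only the
    // S1 entry was growing.
    db.getSiblingDB("app").comments.stats().shards;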

The only solution we found was to restart each mongos after issuing our pre-split commands. After restarting mongos, all the writes went to the correct places, split between S0 and S1.
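
For reference, the flushRouterConfig command is supposed to make a mongos reload its sharding metadata without a restart; whether it is available in 1.8.2, or would have helped here, is unverified, so this is only a hedged suggestion:

    // Run against each affected mongos (A0 and A1).
    db.adminCommand({ flushRouterConfig: 1 });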

This seems like either a bug in mongos failing to pick up these changes, or the docs should be updated to specify that mongos must be restarted after a collection has been dropped and re-sharded.



 Comments   
Comment by Eliot Horowitz (Inactive) [ 02/Sep/11 ]

See SERVER-1726
