[SERVER-8187] Usability issues with splitting + hashed shard keys Created: 15/Jan/13  Updated: 27/Oct/15  Resolved: 01/Feb/13

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 2.4.0-rc1

Type: Bug Priority: Major - P3
Reporter: Kristina Chodorow (Inactive) Assignee: Randolph Tan
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on DOCS-1043 moveChunk and split commands have/wil... Closed
Related
related to DOCS-990 Document bounds parameter of moveChunk Closed
related to SERVER-8335 remember _dataWritten for unaffected ... Closed
is related to SERVER-4172 moveChunk need to specify find error ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

5 shards, create a hashed shard key on user_id, populate with 1 million docs of the form

{user_id:"user"+Math.floor(Math.random()*10000), date:new Date(), something:Math.random()}

. Try to move a chunk.

Participants:

 Description   

It seems difficult to split a chunk that is too big to move when using a hashed shard key.

splitFind claimed to split it, but didn't:

config> db.chunks.find({shard:"test-rs4"}).count()
12
config> sh.splitFind("test.hashy", {'user_id' : {$gt :  NumberLong("6136905156946055959"), $lt : NumberLong("6376878206583911474")}})
{ "ok" : 1 }
config> db.chunks.find({shard:"test-rs4"}).count()
12

split+middle gave the fantabulous error message:

> db.adminCommand({split:"test.hashy", middle:{user_id:NumberLong("6236905156946055959")}})
{
        "cause" : {
                "errmsg" : "exception: can split { user_id: 4525968722311181770 } -> { user_id: 4914820391477438346 } on { user_id: 6236905156946055959 }",
                "code" : 14040,
                "ok" : 0
        },
        "ok" : 0,
        "errmsg" : "split failed"
}

(NumberLong("6236905156946055959") was chosen randomly, just in case you could pass in a hashed value and have it work).

It also seems easy to end up in this situation: I randomly populated a collection with 1 million docs and a 1MB chunk chunk size and no chunk I've tried so far has been small enough to move.



 Comments   
Comment by auto [ 01/Feb/13 ]

Author:

{u'date': u'2013-01-25T23:35:51Z', u'email': u'randolph@10gen.com', u'name': u'Randolph Tan'}

Message: SERVER-8187 Usability issues with splitting + hashed shard keys

Added new option for splitting keys on a cluster using hashed shard keys.
Branch: master
https://github.com/mongodb/mongo/commit/0a590e42086869bb05b2bff36fb104f0f91ba98a

Comment by Randolph Tan [ 01/Feb/13 ]

We decided to be paranoid and require the user to specify both bounds. This is also consistent with the new bounds option for moveChunk.

Comment by Kristina Chodorow (Inactive) [ 01/Feb/13 ]

Why do we need to specify upper and lower bound here? Isn't a chunk uniquely specified by its lower bound?

Comment by Randolph Tan [ 01/Feb/13 ]

New option for moveChunk: bounds. The bounds accepts an array with two elements, which specifies the bounds of the chunk to split. Ex:

{ split: 'test.user', bounds: [{ x: 0 }, { x: 100 }] }

Note: bounds cannot be used together with middle or find. Bounds is useful for splitting a chunk with hashed shard keys.

Comment by Greg Studer [ 25/Jan/13 ]

@kristina - see SERVER-8335

Comment by Kristina Chodorow (Inactive) [ 17/Jan/13 ]

There's one more issue (that maybe should be a new ticket): I used a single mongos for all inserts, but all of the chunks were too big to move. Shouldn't mongos have split the chunks more on insert?

Comment by Kristina Chodorow (Inactive) [ 16/Jan/13 ]

It would be nice to be able to specify a chunk and just have mongod find a split point for it.

Generated at Thu Feb 08 03:16:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.