[SERVER-14761] split command should only allow NumberLongs for hashed shard keys Created: 01/Aug/14  Updated: 06/Dec/17  Resolved: 28/Aug/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.7.4
Fix Version/s: 3.5.13

Type: Improvement
Priority: Major - P3
Reporter: Kevin Pulo
Assignee: Hugh Han
Resolution: Done
Votes: 0
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-9551 Docs for SERVER-14761: split command ... Closed
Related
related to SERVER-9931 hashed shard keys do not appear to ha... Closed
related to SERVER-14759 Splitting very close to an existing d... Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2017-07-10, Sharding 2017-07-31
Participants:

 Description   

When using a hashed shard key, the hashes are all NumberLong values. However, the split command allows splitting at values that are not NumberLongs. Such split attempts are nonsensical and should be rejected.

One particular case is splitting at a double-precision value. Since doubles and NumberLongs are sorted according to their actual numeric value, this leads to chunks with double-precision min/max bounds (or mixed double/NumberLong bounds), rather than exclusively NumberLong bounds. The problem gradually worsens as chunks are auto-split, until eventually SERVER-14759 is hit.

This is also a problem in its own right, independent of SERVER-14759, because not all NumberLongs can be represented exactly as doubles (which is of course why NumberLongs are used in the first place). Chunk bounds expressed as doubles therefore prevent chunks in a hashed shard key from being split as finely as they ought to be. This may lead to uneven load between shards, which is against the design goals of hashed shard keys.
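
The representability gap can be illustrated with a short plain JavaScript sketch (runnable in Node.js, not the mongo shell). It assumes BigInt as a stand-in for a 64-bit NumberLong; the variable names are illustrative only.

```javascript
// BigInt stands in for a 64-bit NumberLong here (an assumption for illustration).
const exact = 9007199254740993n;        // 2^53 + 1, well within the signed 64-bit range
const asDouble = Number(exact);         // converted to the nearest double
const roundTripped = BigInt(asDouble);  // converted back to an integer

// The round trip does not return the original value.
console.log(exact === roundTripped);

// Larger magnitudes are coarser still: two adjacent integers near 2^62
// map to the same double.
const collapsed = Number(2n ** 62n) === Number(2n ** 62n + 1n);
console.log(collapsed);
```

Under these assumptions, any hash value that falls between two adjacent doubles cannot be named as a split point once the bound is stored as a double.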

It is true that auto-splits are always correctly calculated as NumberLongs, but there are some situations where manual splits are required. In such cases, SERVER-14217 makes it easy to accidentally split at double-precision values instead of NumberLongs.



 Comments   
Comment by Githook User [ 13/Jul/17 ]

Author: Hugh Han (hughhan1) <hughhan1@gmail.com>

Message: SERVER-14761 only allow NumberLong as split key in hashed shard patterns

When a shard uses a hashed key pattern, only NumberLong types may be used
as split keys.

A small bug in hash_basic.js regarding 'middle' syntax was also fixed.
Instead of adding 10000 to the document as a whole, 10000 is now added to the
actual number in the document.

A small bug in read_only_test.js regarding the shardColl(...) function
was also found. It was fixed to now do what I believe the author originally
intended it to do.
Branch: master
https://github.com/mongodb/mongo/commit/9e478e41643e736104e15d6bc7a3065c19b37e17

Comment by Kevin Pulo [ 21/Oct/15 ]

No doubt it is a problem, but to me, so is this:

> sh.enableSharding("foo")
{ "ok" : 1 }
> sh.shardCollection("foo.bar", { _id: "hashed" } )
{ "collectionsharded" : "foo.bar", "ok" : 1 }
> sh.splitAt("foo.bar", { _id: 10 } )
{ "ok" : 1 }
> sh.splitAt("foo.bar", { _id: NumberLong("11") } )
{ "ok" : 1 }
> sh.splitAt("foo.bar", { _id: "foobar" } )
{ "ok" : 1 }
> sh.status()
...
                foo.bar
                        shard key: { "_id" : "hashed" }
                        chunks:
                                shard01 5
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong(0) } on : shard01 Timestamp(1, 1)
                        { "_id" : NumberLong(0) } -->> { "_id" : 10 } on : shard01 Timestamp(1, 3)
                        { "_id" : 10 } -->> { "_id" : NumberLong(11) } on : shard01 Timestamp(1, 5)
                        { "_id" : NumberLong(11) } -->> { "_id" : "foobar" } on : shard01 Timestamp(1, 7)
                        { "_id" : "foobar" } -->> { "_id" : { "$maxKey" : 1 } } on : shard01 Timestamp(1, 8)

This resulting config doesn't really have any meaning: given that hashed indexes are always NumberLong, why should anything else be allowed?

I would rather these attempted splits be refused:

> sh.splitAt("foo.bar", { _id: 10 } )
{
        "ok" : 0,
        "errmsg" : "split point { _id: 10.0 } is not NumberLong, which is required for hashed shard keys"
}
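
A minimal sketch of the kind of check being requested, written in plain JavaScript (Node.js) rather than as server code. Everything here is hypothetical: the `NumberLong` class is a stand-in for the shell type, `checkHashedSplitKey` is an invented helper, and the error text is modelled on the proposed message above. It is not the server's actual implementation.

```javascript
// Hypothetical stand-in for the shell's NumberLong type; not the real class.
class NumberLong {
  constructor(digits) {
    this.value = BigInt(digits);
  }
}

// Hypothetical helper: for every field marked "hashed" in the shard key
// pattern, require the corresponding split key value to be a NumberLong.
function checkHashedSplitKey(shardKeyPattern, splitKey) {
  for (const [field, kind] of Object.entries(shardKeyPattern)) {
    if (kind !== "hashed") continue;
    if (!(splitKey[field] instanceof NumberLong)) {
      return {
        ok: 0,
        errmsg: "split point " + JSON.stringify(splitKey) +
          " is not NumberLong, which is required for hashed shard keys",
      };
    }
  }
  return { ok: 1 };
}

// The three split attempts from the transcript above.
const pattern = { _id: "hashed" };
const results = [
  checkHashedSplitKey(pattern, { _id: 10 }),
  checkHashedSplitKey(pattern, { _id: new NumberLong("11") }),
  checkHashedSplitKey(pattern, { _id: "foobar" }),
];
console.log(results.map((r) => r.ok));
```

In this sketch only the `NumberLong("11")` split point is accepted; the plain number and the string are both refused.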

Something would still need to be done about any already-existing non-NumberLong values that had made their way into hashed shard key chunk endpoints (e.g. upconvert if possible?).

SERVER-14217 is an exacerbating factor, and is where these values often come from in the field, i.e. users are trying to manually split chunks, without realising that NumberLongs are automatically and silently coerced to double-precision floats when arithmetic is done on them.

> NumberLong(3) + NumberLong(2)
5
> typeof(NumberLong(3) + NumberLong(2))
number
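
The effect on a manually computed split point can be sketched in plain JavaScript (Node.js, not the mongo shell). BigInt is assumed here as a stand-in for 64-bit integer arithmetic, and the chunk bounds `lo` and `hi` are made-up values chosen for illustration.

```javascript
// Made-up chunk bounds near 2^62 (illustrative values only).
const lo = 4611686018427387905n; // 2^62 + 1
const hi = 4611686018427387911n; // 2^62 + 7

// Midpoint computed with exact integer arithmetic.
const exactMid = (lo + hi) / 2n;

// Midpoint computed after coercing both bounds to doubles, which is what
// happens when arithmetic silently produces a plain number.
const doubleMid = BigInt((Number(lo) + Number(hi)) / 2);

// Check whether each candidate split point lies strictly inside the chunk.
const exactInside = lo < exactMid && exactMid < hi;
const doubleInside = lo < doubleMid && doubleMid < hi;

// The exact result could then be written in the string form used in the
// transcript above, e.g. NumberLong("<digits>").
const shellLiteral = 'NumberLong("' + exactMid.toString() + '")';

console.log(exactMid, exactInside);
console.log(doubleMid, doubleInside);
console.log(shellLiteral);
```

With these bounds the double-based midpoint rounds to a value below `lo`, so it is not a usable split point for the chunk, while the integer midpoint stays inside it.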

Comment by Andy Schwerin [ 19/Oct/15 ]

Seems to me that SERVER-14759 is the real problem, here.

Generated at Thu Feb 08 03:35:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.