[SERVER-8772] Documents with null value for hashed shard key are not returned via mongos Created: 27/Feb/13 Updated: 11/Jul/16 Resolved: 04/Mar/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 2.4.0-rc1 |
| Fix Version/s: | 2.4.0-rc2 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Ed Costello | Assignee: | Aaron Staple |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
OS X, 3 shards on 3 replica sets, 1 mongos |
||
| Operating System: | ALL | ||||
| Steps To Reproduce: | Import the 2009 NYS campaign finance database http://www.elections.ny.gov/NYSBOE/download/ZipDataFiles/2009gen.zip using mongoimport -d qa -c gen2009 --fieldFile headers.h --ignoreBlanks --type csv 2009gen.out (note that headers.h is attached to this ticket) ) ) should return 7463 documents ) ) -returns no documents ).count() returns 7463 ).count() returns 7322 documents |
||||
| Description |
|
I have a collection of 100k+ documents in which about 7% are missing the field used for a hashed shard key. Normally you cannot shard a collection where some documents are missing the shard key field, but you can if you use a hashed shard key. Prior to sharding, queries on the field (using {CITY: null}) succeed; after sharding they appear to fail. The query is directed to the shards and they appear to process it, but mongos does not return any documents. This only happens if the field with null values is the shard key. I have reproduced this with the mongo shell and pymongo, but have not narrowed it down enough to write a JS test case. I am not seeing any obvious errors in my mongos or mongod logs. |
| Comments |
| Comment by Ed Costello [ 06/Mar/13 ] | ||
|
Am retesting this afternoon (6 March) | ||
| Comment by Aaron Staple [ 04/Mar/13 ] | ||
|
commit 696dec1262372b0ac45bad9c84de4700eb0d2e71 Make the IndexSpec::missingField() implementation IndexType specific, and use missingField() to properly identify missing fields in CheckShardingIndex::run(). | ||
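The idea in this commit message can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the server's C++ code: the class names, `missing_field()`, `key_for()`, `has_missing_shard_key()`, and `stand_in_hash()` are all hypothetical, and the hash is a stand-in rather than MongoDB's actual hashed-index function. It shows each index type reporting its own "missing field" key, with the check comparing against that value instead of assuming null.

```python
import hashlib
import struct


def stand_in_hash(value):
    """Hypothetical 64-bit hash; NOT MongoDB's real hashed-index function."""
    digest = hashlib.md5(repr(value).encode("utf-8")).digest()
    return struct.unpack("<q", digest[:8])[0]


class PlainIndexSpec:
    """Models a regular index: a missing field is stored as null (None)."""

    def missing_field(self):
        return None

    def key_for(self, doc, field):
        return doc.get(field)


class HashedIndexSpec:
    """Models a hashed index: a missing field is stored as hash(null)."""

    def missing_field(self):
        return stand_in_hash(None)

    def key_for(self, doc, field):
        return stand_in_hash(doc.get(field))


def has_missing_shard_key(spec, docs, field):
    # Compare each index key with the index type's own "missing" key.
    # Note: in this model an explicit null value matches as well.
    missing = spec.missing_field()
    return any(spec.key_for(doc, field) == missing for doc in docs)


complete = [{"CITY": "ALBANY"}, {"CITY": "BUFFALO"}]
partial = complete + [{"NAME": "document without CITY"}]

for spec in (PlainIndexSpec(), HashedIndexSpec()):
    print(type(spec).__name__,
          has_missing_shard_key(spec, complete, "CITY"),
          has_missing_shard_key(spec, partial, "CITY"))
```

In this model both index types flag the collection that contains a document without `CITY`, because the comparison value comes from the index type rather than being hard-coded as null.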
| Comment by Daniel Pasette (Inactive) [ 02/Mar/13 ] | ||
|
Adding a test case for sharding an existing collection using single, compound, and hashed shard keys. We can special-case hashed shard keys and check for the hashed value of null instead. | ||
| Comment by Aaron Staple [ 02/Mar/13 ] | ||
|
This is not code I'm super familiar with, but it looks like running the shardCollection command on mongos sends out checkShardingIndex commands to the mongods. checkShardingIndex then checks for index keys where the shard key is null, which indicates that a shard key field may be absent from a document (see CheckShardingIndex::run()).
Hash indexes don't store a key of null for a missing field, but instead they store the hash of null. Missing values cannot be identified by the presence of null index keys in the current hash index implementation. |
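
The behavior described in this comment can be modeled in a few lines of Python. This is a hedged sketch, not the actual checkShardingIndex code: `stand_in_hash()`, `plain_index_key()`, `hashed_index_key()`, and `naive_has_missing_shard_key()` are hypothetical names, and the hash is only a stand-in for whatever the hashed index really computes. It shows why a check that looks for null index keys finds the missing field in a regular index but not in a hashed one.

```python
import hashlib
import struct


def stand_in_hash(value):
    """Hypothetical 64-bit hash; NOT MongoDB's real hashed-index function."""
    digest = hashlib.md5(repr(value).encode("utf-8")).digest()
    return struct.unpack("<q", digest[:8])[0]


def plain_index_key(doc, field):
    # A regular index stores null (None) when the field is absent.
    return doc.get(field)


def hashed_index_key(doc, field):
    # A hashed index stores the hash of null when the field is absent,
    # so the stored key is an ordinary integer, never null itself.
    return stand_in_hash(doc.get(field))


def naive_has_missing_shard_key(index_keys):
    # Model of the null-key check: look for null among the index keys.
    return any(key is None for key in index_keys)


docs = [{"CITY": "ALBANY"}, {"CITY": "BUFFALO"}, {"NAME": "document without CITY"}]

plain_keys = [plain_index_key(doc, "CITY") for doc in docs]
hashed_keys = [hashed_index_key(doc, "CITY") for doc in docs]

print(naive_has_missing_shard_key(plain_keys))   # True: the null key is visible
print(naive_has_missing_shard_key(hashed_keys))  # False: hash(null) is not null
```

With the hashed keys the null check passes even though one document lacks `CITY`, which matches the report that a collection with missing shard key fields could still be sharded on a hashed key.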