[SERVER-5065] a missing field nested within an array is sorted as 'null', even if the same field exists inside another array element Created: 24/Feb/12 Updated: 17/Aug/17 Resolved: 15/Aug/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 2.0.2, 2.1.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Andrew Morrow (Inactive) | Assignee: | David Storch |
| Resolution: | Done | Votes: | 3 |
| Labels: | query_triage | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Description |
|
Background Sorting If there are multiple values for a field, only one of them is used to sort a document. For example when sorting the document { a:[ 1, 3, 2 ] } using sort spec { a:1 } only one of the values of 'a' is used to order the document. The value of 'a' chosen for sorting is the first that would be encountered while traversing an index on the sort spec (in this case { a:1 }) to resolve the query. For example if the query is {}, the first value of 'a' encountered on index { a:1 } is 1. If the query is { a:{ $gte:2 } } the first value is 2. Note that this policy is implemented regardless of whether there is actually an index { a:1 } or if sorting is performed using the { a:1 } ordering in memory, without an index. (The results must be the same regardless of whether or not an index is used.) Consider the case of document { a:[ 1, null, 2 ] }. In this example, 2 is the largest value of 'a'. It is the first value of 'a' that would be encountered for index { a:-1 } and is the value used for sorting. The smallest value of 'a' is 'null'. It is the first value of 'a' that would be encountered for index { a:1 } and is the one used for sorting. Furthermore, given two documents { _id:'x', a:[ 5, null ] } and { _id:'y', a:[ 2, 3 ] }, if sorting is performed according to order { a:1 } the documents will be ordered with document x followed by document y because null is less than 2. This behavior was requested in Index Key Extraction In many cases, indexing a nested field within an array uses an existing value from the document. Eg for index { 'a.b':1 } document { a:[ { b:5 } ] } -> produces index key 'a.b':5 If the nested field is missing however, a null value is stored in the index document { } (no 'a' field present) -> produces index key 'a.b':null In the case where some array values have an 'a.b' field and some do not, a mixture of null and non null index keys is produced: { a:[ { b:5 }, {} ] } -> 'a.b':5, 'a.b':null Query Matching Semantics For a simple query, a request for null will match missing values only if there are no non missing values for the key. But a request for null will always match an explicit null value. For query { 'a.b':null } { } (empty document) matches However, the $elemMatch operator will restrict matching to individual array elements. If a single array element has a missing 'b', the document will match null. For query { a:{ $elemMatch:{ b:null } } } { a:[ { b:2 } ] } does not match because the only value of b is 2 This behavior was requested in Observed behavior in this ticket: Proposed behavior for this ticket: Aaron ----------------------------------------------------------------------------------------------- The attached test case documents and reproduces the problem, but the tl;dr is that if you have a colleciton with documents like: { 'x' : [ { 'subfield': x }, { 'other_subfield': a } ] } Then trying to sort on x.subfield works in a DESCENDING sort, but fails to order the records on the selected subfield in an ASCENDING sort. |
| Comments |
| Comment by David Storch [ 15/Aug/17 ] | ||||||||||||||||||||||||||||||||||||||
|
Hi all, We have settled on consistent array sort behavior for both the find and aggregate commands under The undesirable behavior here has more to do with our indexing semantics for null than with sorting. Please feel free to reach out here with any followup questions. Best, | ||||||||||||||||||||||||||||||||||||||
| Comment by Ron Avnur [ 15/Feb/13 ] | ||||||||||||||||||||||||||||||||||||||
|
aaron, proposed behavior in description sounds reasonable to me. | ||||||||||||||||||||||||||||||||||||||
| Comment by Aaron Staple [ 24/Oct/12 ] | ||||||||||||||||||||||||||||||||||||||
|
As a workaround you can use the query portion of your find to ensure that the sort field does not match null, eg using original example from the description:
| ||||||||||||||||||||||||||||||||||||||
| Comment by Wei Kong [ 21/Aug/12 ] | ||||||||||||||||||||||||||||||||||||||
|
This is affecting our production deployment. We provide a cloud services to app developers and the sort is currently broken for all developers and their apps. Please bump up the priority if you can. Thanks! | ||||||||||||||||||||||||||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 27/Feb/12 ] | ||||||||||||||||||||||||||||||||||||||
|
Hi Aaron - I just noticed that the 'Affects Versions' field was changed from 2.0.2 to 2.1.0 for this ticket. I definitely experience this problem with 2.0.2, and having the 'affects' set to 2.1 makes it look as if this issue only affects the unstable version. Does setting it to 2.1 mean that this will not be fixed in the stable/2.0 release series? | ||||||||||||||||||||||||||||||||||||||
| Comment by Aaron Staple [ 24/Feb/12 ] | ||||||||||||||||||||||||||||||||||||||
|
Haven't fully checked, but I think the issue is that the {'other_subfield':a} sub document is creating an invisible null key for 'x.subfield' (see
|