[SERVER-13426] weird $sort behavior Created: 31/Mar/14  Updated: 23/Sep/15  Resolved: 25/Apr/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.4.9, 2.6.0-rc2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Steve Duan Assignee: David Storch
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-20551 sort on multikey index after $elemMat... Closed
Related
related to SERVER-11878 Results in incorrect order for sharde... Closed
Operating System: ALL
Participants:

 Description   

>db.test.drop()
>db.test.insert([{a:[{b:{c:3}}, 1]}, {a:[{b:{c:5}},1]}, {a:[{b:{c:4}},1]}])
>db.test.find().sort({'a.b.c':1})
{ "_id" : ObjectId("5339c1f58b8794d6f2676019"), "a" : [ { "b" : { "c" : 3 } }, 1 ] }
{ "_id" : ObjectId("5339c1f58b8794d6f267601a"), "a" : [ { "b" : { "c" : 5 } }, 1 ] }
{ "_id" : ObjectId("5339c1f58b8794d6f267601b"), "a" : [ { "b" : { "c" : 4 } }, 1 ] }

should the {a:[{b:{c:5}},1]} be the last doc? They do if they don't have the 1 in the array.

>db.test.drop()
>db.test.insert([{a:[{b:{c:3}}]}, {a:[{b:{c:5}}]}, {a:[{b:{c:4}}]}])
>db.test.find().sort({'a.b.c':1})
{ "_id" : ObjectId("5339d2a26d764829de404bec"), "a" : [ { "b" : { "c" : 3 } } ] }
{ "_id" : ObjectId("5339d2a26d764829de404bee"), "a" : [ { "b" : { "c" : 4 } } ] }
{ "_id" : ObjectId("5339d2a26d764829de404bed"), "a" : [ { "b" : { "c" : 5 } } ] }



 Comments   
Comment by David Storch [ 25/Apr/14 ]

We have decided that the sort behavior is okay as it currently stands. Closing as Works as Designed.

Comment by J Rassi [ 31/Mar/14 ]

To improve clarity of this issue, I will explain the existing sort semantics.

When a document generates a single sort key (e.g. the document {a:"hello"} or the document {a:["hello"]} for the sort spec {a:1}), there's no ambiguity as to which sort key is used for the ordering (since there's only one). When a document generates multiple sort keys, there is a process to determine which sort key is used for the ordering: the sort keys are lined up according to the sort order, and then the first sort key that matches the query predicate is used for the ordering.

Consider this simpler example:

> db.test.insert([{_id:0,a:[3, null]}, {_id:1,a:[5, null]}, {_id:2,a:[4, null]}])
> db.test.find({}).sort({a:1})
{ "_id" : 0, "a" : [ 3, null ] }
{ "_id" : 1, "a" : [ 5, null ] }
{ "_id" : 2, "a" : [ 4, null ] }

To determine the sort key for the {_id: 0} document, its candidate sort keys are arranged with respect to the sort spec: [null, 3]. Since null (the first one) matches the query predicate {}, null is chosen for that document as the definitive sort key for this query. The same process is run with the other documents to finally generate the three sort keys [null, null, null] for the three documents. Since all three documents "tie" in the sort by generating the same sort key, the order they appear in the query is not defined.

Now consider a query with the same sort but a different query predicate:

> db.test.find({a: {$gt: 0, $lt: 10}}).sort({a:1})
{ "_id" : 0, "a" : [ 3, null ] }
{ "_id" : 2, "a" : [ 4, null ] }
{ "_id" : 1, "a" : [ 5, null ] }

The {_id: 0} sort keys arranged with respect to the sort spec are still [null, 3], but this time the first key (null) doesn't match the query predicate 0<a<10, but the second key (3) does. So, the sort keys [3, 4, 5] are chosen for the three documents, and the results of the query are ordered as such.

This example is actually isomorphic to the example in your report (the sort key for the document {a: [1]} is null for the sort spec {"a.b.c": 1}, so your first document generates the sort keys [3, null]).

Note in addition SERVER-11878 (this behavior is broken for sort on a sharded collection) and David's comment above that this behavior may be changed in a future release.

Comment by David Storch [ 31/Mar/14 ]

Hi Steve,

For better or for worse this is how sort semantics worked in version 2.4.9. This behavior has been preserved for the upcoming 2.6 release so as to avoid a backwards breaking change. I've labeled this ticket as "Needs Triage"; our team will review in order to decide whether or not we want to change the sort semantics in the future.

Generated at Thu Feb 08 03:31:42 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.