[SERVER-22872] Order by is not working in 3.2.3 Created: 26/Feb/16 Updated: 04/May/16 Resolved: 29/Feb/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 3.2.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | ITWEBTF SAXOBANK | Assignee: | David Storch |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||
| Steps To Reproduce: | The query:
The sorting order:
or
Notice that the documents are returned in the same order regardless of requesting ascending or descending. See the documents and indexes in the attached Word document. |
||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Description |
|
I query on a collection with three documents, expecting to get documents back ordered by the "Keys.Queryables" field. This works as expected in 2.6.1, but in 3.2.3 the documents are always returned in ascending order. |
| Comments |
| Comment by ITWEBTF SAXOBANK [ 29/Mar/16 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hi David, It looks like I have confused you a bit with my last comment. It is true that we could avoid using the '|' character and thereby avoid the immediate problem, but our analysis showed that doing this change with our current code and data is not feasible. Please understand that sorting by an array field was not our first design choice. We struggled with full scans of collections with millions of documents, and after trying out several complex solutions that only partly solved the problem, we found that a solution based on sorting by array fields eliminated the full scans completely. In other words, we eliminated complexity in our code by utilizing what the database is good at. This is a best practice approach. To me it makes little sense that you ask us to change our code significantly - again - when it is clear that you broke backwards compatibility in 3.0. Fixing the bug that causes escaped '|' not to be recognized as such would allow us to update to 3.2.3. However, if you make the next breaking change we will be stuck on that version. Since our main reason for upgrading is that we do not want to be stuck on an old and potentially unsupported version, it makes little sense to upgrade at all. As a developer, I can partly understand why you want to keep the code clean and simple. Backwards compatibility is never easy. However, MongoDb is now supposed to be a mature product so it is not acceptable that you are so hesitant to fix backwards compatibility bugs. For now, our architect and management teams have lost faith in MongoDb. Regards, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by David Storch [ 22/Mar/16 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Glad to hear that the modified example works as expected. Keep in mind that you should be able to work around this issue using the aggregation framework. You may also be able to use direct comparisons to strings rather than a regular expression in order to get the desired behavior.
In general I would strongly caution against sorting by an array field. The problem with the current semantics is that an internal change to the planner can affect the expected sort order! This is a very strange situation to be in. We would much rather that the sort order is a logical property which is determinate for any (data, sort spec) pair. That is why we are considering making the breaking change described in Best, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by ITWEBTF SAXOBANK [ 01/Mar/16 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I have added a modified example in which I do not use a pipe symbol. It seems to work in 3.2.3 exactly as it did in 2.6.11! Now to the important question - will it continue to work this way in future versions? I am asking because you have advocated a breaking change that would break both 3.x and 2.x versions - I don't agree, although I can understand why you would like to fix quite a few 'ordering by array fields' issues this way. Regards, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by ITWEBTF SAXOBANK [ 01/Mar/16 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
When there is no pipe symbol, the query analyzer does not see the query as "complex". | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by ITWEBTF SAXOBANK [ 01/Mar/16 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hi Dave, Thanks for a thorough answer. It seems that the core of this issue is our usage of the pipe '|' symbol, as in a reg ex the usage of such symbol will change the query from "simple" to "complex". However, we do not use the pipe symbol in our reg exes, as we escape them and search for the literal character. As I have mentioned in a comment in SERVER-22890, fixing the bug that causes the escaped pipe character to be recognized as a non-escaped pipe character in reg exes could help. One question,
My gut feeling on Regards, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by David Storch [ 29/Feb/16 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I have investigated this issue and have some results to report. I will explain some background information, then describe why you're seeing this behavior, and finally recommend some some steps for moving forward. There is quite a bit of detail here, so please let me know if anything is unclear. BackgroundThe problem query involves sorting by the value of an array field. The semantics of sorting by an array in MongoDB are somewhat surprising in that the value of the sort key depends on the query predicate. Consider a simple collection with no indices that has the following two documents:
Now consider the following query which sorts on the array a:
Why does this sort the document with _id 1 before the document with _id 2? The answer is due to how the MongoDB query engine generates sort keys. The engine supposes that there is an index {a: 1}. It then generates the index bounds that this query would scan on this hypothetical index—namely, the interval [4, +Infinity]. The sort key for a document is the first in-range array element. For doc _id 1, this element is "4"; for _id 2, it is "5". Since 4 is less than 5, doc 1 precedes doc 2. You can see the sort keys that the system is generating in 3.2.x versions using a new feature called sortKey $meta projection:
Let's consider one more example using the same collection:
The only thing that we changed is the query predicate, but you can see that the sort key generation algorithm yields different sort keys, and hence the resulting sort order is different. Note also that we are evaluating an open proposal to change the semantics of sorting on an array: see What changed from 2.6?The behavior change you observed is due to the fix for Since your queries include regular expressions with the "|" character, they will end up generating loose index bounds. Specifically, the index bounds will include all strings. This means that when the application-requested sort order is {"Keys.Queryables": 1}, the sort key for a particular document will simply be the smallest string in the Keys.Querables array. Similarly, when the sort is {"Keys.Queryables": -1}, the sort key for a document will be the largest string in the Keys.Querables array. Augmenting your queries with a sortKey $meta projection shows this to be the case. The sort key for each document is contained in the sortKey field in the output below:
Next stepsI am closing this ticket as Works as Designed, since I don't see evidence of a bug in the system. Please watch/vote for As a more immediate workaround, I think you should be able to use the aggregation system to obtain the desired behavior. If the result set is small, another workaround would be to compute the desired sort on the client side. Best, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 26/Feb/16 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thanks for your report itwebtf@saxobank.com, I'm able to reproduce the behavior you describe and we're investigating. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by ITWEBTF SAXOBANK [ 26/Feb/16 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Correction, I meant that it works on 2.6.11. |