[SERVER-1205] $or sort does not use index ranges expressed in or clauses Created: 08/Jun/10 Updated: 07/Apr/23 Resolved: 10/Jan/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | 2.5.5 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Aaron Staple | Assignee: | hari.khalsa@10gen.com |
| Resolution: | Done | Votes: | 115 |
| Labels: | query_triage | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
Decide whether or not this is necessary and if it is then implement. |
| Comments |
| Comment by Edward [ 27/Apr/14 ] |
|
I have created an issue with explaining the above: https://jira.mongodb.org/browse/SERVER-13754 |
| Comment by Asya Kamsky [ 27/Apr/14 ] |
|
Edouard: please include data, query and explain with these kinds of reports otherwise it's very difficult to validate what works like it's supposed to and what doesn't. |
| Comment by Edward [ 27/Apr/14 ] |
|
It seems that at current version (2.6.1), the query optimizer is still unable to optimize queries that use $or, $in, limit() and sort() all at once. The https://jira.mongodb.org/browse/SERVER-1205 and https://jira.mongodb.org/browse/SERVER-3310 fixes, each only improved performance on queries having 3 out of the 4 operations listed above. When introducing a 4th operation into the query, the optimization goes out the window. This behavior is observed with full index and document scans within the $or clause, even though a limit() is specified. |
| Comment by Githook User [ 17/Apr/14 ] |
|
Author: {u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}Message: |
| Comment by Oleg Slobodskoi [ 14/Feb/14 ] |
|
Hopefully this will be released soon, as it is a blocker issue for me too: https://groups.google.com/forum/#!topic/mongodb-user/Vbmp3_dtarY Thanks to Andrew Ryder who has explained this. |
| Comment by Githook User [ 10/Jan/14 ] |
|
Author: {u'username': u'hkhalsa', u'name': u'Hari Khalsa', u'email': u'hkhalsa@10gen.com'}Message: |
| Comment by Silvestrs Kante [ 31/Dec/13 ] |
|
+1 This issue is causing lot of workarounds to our solution. |
| Comment by Oran Ben-Zur [ 10/Nov/13 ] |
|
+1, critical for the entire developer team in my company. |
| Comment by Paul Klimashkin [ 10/Nov/13 ] |
|
+1, Faced with the same problem. Very critical |
| Comment by Subramanya HS [ 04/Oct/13 ] |
|
+1, we are facing lot of challenges with this bug |
| Comment by Srinivas R [ 04/Oct/13 ] |
|
+1 |
| Comment by Jonathan Gaillard [ 26/Aug/13 ] |
|
+1 for this need as we make heavy use of these queries ! |
| Comment by auto [ 19/Jul/13 ] |
|
Author: {u'username': u'hkhalsa', u'name': u'Hari Khalsa', u'email': u'hkhalsa@10gen.com'}Message: |
| Comment by Eliot Horowitz (Inactive) [ 08/May/13 ] |
|
The current fixVersion is 2.5.w (see above), meaning we're trying to get it into 2.6. |
| Comment by Petr Gottvald [ 07/May/13 ] |
|
Common guys ! It's been 2 years since this issue was opened ! What is the current state on this ? Are you really going to fix this or not ? |
| Comment by Steve McMillen [ 28/Apr/13 ] |
|
This bug will cost our organization a deal of time to develop around this deficiency. Can someone at 10gen please indicate if this is going to be resolved and when so we all don't have to keep guessing at if we should spend the effort to work around this problem or wait? |
| Comment by Johan Hedin [ 07/Apr/13 ] |
|
Fully agree with the comments above. One vote from me as well and hoping to see this in 2.6. |
| Comment by Juheejin [ 28/Mar/13 ] |
|
Really big issue for us... +1 for a fix. |
| Comment by Roy [ 15/Mar/13 ] |
|
Really big issue for us... +1 for a fix. |
| Comment by Noel Lynagh [ 25/Feb/13 ] |
|
We're in the process of switching back to RDBMS for our activity stream because of this. Quite disappointed as Mongo is the perfect tool in every other area for us. |
| Comment by Adam Baldwin [ 25/Feb/13 ] |
|
So, which is more efficient, using the indices combined from the "or" or the index from the "sort"? Agreed, this is a major weakness compared to SQL DB's! |
| Comment by janson chen [ 18/Feb/13 ] |
|
This bug has very long time.I need use or and sort query together in my job. I want you can solve it as soon as possible. Thanks! |
| Comment by Daniel Pasette (Inactive) [ 30/Jan/13 ] |
|
10gen definitely considers this a priority and a fix is currently scheduled as one of the top priorities for the 2.6 release. We are currently making major refactor to the query planner (matcher) and optimizer which will include a fix for this issue. However, it was not easy to make this fix with the current architecture in advance of the fundamental code changes that are currently underway. -Dan |
| Comment by Carlos Rodriguez [ 29/Jan/13 ] |
|
Bugs like this have caused me to abandon MongoDB. I consider it a "legacy" part of my production infrastructure now. How many years does a "Major" issue have to be open before 10gen gives a #*!&? |
| Comment by Daniel Gibbons [ 29/Jan/13 ] |
|
+1 for fix. My workaround was to issue 2 separate sorted queries then merge the results in application logic. |
| Comment by drapeko [ 29/Jan/13 ] |
|
+1 for a fix. spent significant amount of time trying to understand what's wrong |
| Comment by Noel Lynagh [ 24/Jan/13 ] |
|
+1 for a fix. I first encountered this in summer 2011. The workaround I created then (10 separate queries with a huge limit and then combining the results in PHP) has now been outgrown. |
| Comment by Andreas Bernhardsson [ 24/Dec/12 ] |
|
To bad that this was pushed back, I was really hoping or this in the next release. It would greatly cut down the amount of data I need to send back to the application from MongoDB. Now have to make about 8 queries with limit 20 and then sort and decide which 20 documents to use, instead of just letting MongoDB return the 20 documents. I really hope you reconsider and include it in 2.3.x, It's all I really want for Christmas. =) |
| Comment by Solomon Wu [ 30/Nov/12 ] |
|
This type of query is used in an important functionality (feed style, sorted by time). |
| Comment by Ivaylo Sharkov [ 29/Nov/12 ] |
|
+1 to be fixed ASAP |
| Comment by Ivan Kotev [ 23/Nov/12 ] |
|
Guys, |
| Comment by Patrick Neff [ 09/Oct/12 ] |
|
I agree this needs to be fixed. Before I found this I thought I was just being stupid for a few weeks. |
| Comment by Ashish Trivedi [ 12/Sep/12 ] |
|
Needs fix very soon. Its preventing very important functionality. |
| Comment by liugen [ 07/Sep/12 ] |
|
Please fix it ASAP. Thanks!!! |
| Comment by elsonwu [ 30/Jul/12 ] |
|
@Montse Medina |
| Comment by Burcu Dogan [ 14/Jun/12 ] |
|
Either fix it or document it loudly. At least devs should be aware of the need for an alternative filtering mechanism without querying with $ors. |
| Comment by Tao Chen [ 14/Jun/12 ] |
|
We have the same problem. Will this be fixed in ver2.2 in July? |
| Comment by Venkata Ramanan [ 15/May/12 ] |
|
Badly needing this to be fixed. Without this mongodb looks not so fit for report related applications. |
| Comment by Glen Xiao [ 11/May/12 ] |
|
+1 vote. It does not work well even if "hint" an index. Unnecessary index entries and objects are scanned. It should work better. |
| Comment by Helmut Zechmann [ 22/Feb/12 ] |
|
+1 for this beeing necessary from me too. |
| Comment by Matthias Felsche [ 21/Feb/12 ] |
|
+1 for this being necessary. This is a basic query-use-case we really need. Nonetheless, great work! I really enjoy using mongodb (except this issue). |
| Comment by Aaron Staple [ 19/Jan/12 ] |
|
@arie - The issue here isn't the choice of indexes, but the fact that we can't currently sort results returned by a sequence of indexes instead of a single index (which is the way $or works). Thus even if you could hint a multiple index plan we'd need to sort the results - which is the bulk of the work here. |
| Comment by Arie Grapa [ 19/Jan/12 ] |
|
guys, if you are not fixing the optimizer before 2.2 to make the right choice of plans, could you at least expose some mechanism, similar to "hint", so we can tell the optimizer which plan to use? |
| Comment by Jordan Berg [ 29/Nov/11 ] |
|
Another vote on this being a pain! Thanks for all the great work mongo team. |
| Comment by Aaron Staple [ 24/Nov/11 ] |
|
Michael - the change you have proposed has the effect of dropping the sort spec in some cases. You may get index usage in these cases, but the results may be ordered incorrectly. There is a high likelihood this feature will be implemented in 2.2. |
| Comment by Michael Barrientos [ 22/Nov/11 ] |
|
This bug prevents us from implementing very basic features of our product, like search pagination because sorting by date absolutely kills performance, and workarounds are all performance killing as well. Because this is so painful, I started looking at the code. In db/queryoptimizer.cpp, line 858 currently looks like this: if ( !order.isEmpty() || !min.isEmpty() || !max.isEmpty() ) { Making the following change made it no longer do full table scans for at least some of the cases that affect our work: if ( !min.isEmpty() || !max.isEmpty() ) { I'm afraid to make this change by itself, since specifically banning sorts in the code suggests there's something downstream that assumes sort-order is empty. (i.e. it's a feature, not a bug). Aaron, according to git blame, you're the only one touching this section of code. Is there any insight you can give into the reason those lines exist, or is it safe to just remove that? (It was last substantially touched in the fix for |
| Comment by Montse Medina [ 17/Nov/11 ] |
|
This is extremely painful for us. This destroys the our performance. It's very critical that Mongo fixes this issue as soon as possible, as $or clauses are very frequent in our query load. |
| Comment by Swaroop Bhandary [ 05/Sep/11 ] |
|
We are also facing a lot of problems. I fail to understand why would the sort not use the existing indexes. All sort are resulting in full table scans. Kindly schedule this bug fix on priority. |
| Comment by Thomas DeMille [ 25/Aug/11 ] |
|
@Carlos Yes, we are having to devise hacky kind of workarounds as well, and abandon some functionality because of this... |
| Comment by Carlos Rodriguez [ 25/Aug/11 ] |
|
If this is technically feasible to fix, it would be very helpful. In my app, we have a prominent Facebook-like news feed which sorts by timestamp -1, and fetches activity related to the user's friends, groups, events, etc. There are many $or pieces, which combined with the sort would make the query unusable performance-wise due to this bug. I've devised a workaround which involves a scatter-and-sort method. In PHP, each $or is broken into its own query, flattened with the non-$or conditions. This results in 12 or so indexed queries, limit 25 each. Then the results are combined and sorted back in PHP. Not ideal, it would be much better to have this done by mongo instead! |
| Comment by Thomas DeMille [ 22/Aug/11 ] |
|
This is making a large number of my queries do full table scans.. The hint works, but I don't understand why the optimizer doesn't try that index and deduce that the index is MUCH faster? |
| Comment by Thomas DeMille [ 22/Aug/11 ] |
|
I'm not sure I understand, if the query is {$or:[ {a:1}, {b:1}]} ).sort( {c:1}and the index is {a:1,b:1,c:1}why wouldn't it choose that index? |
| Comment by Chad Remesch [ 15/Jul/11 ] |
|
This has also caused us a lot of pain. We are current trying to come up with a workaround. |
| Comment by Petr Gottvald [ 21/Jun/11 ] |
|
This has been making my life very painful lately. Can we pease get more votes on this issue ? |
| Comment by Ed Rooth [ 10/Feb/11 ] |
|
This is preventing us from running the query that we would like to run (or query with a sort), so I'd just like to add my vote for a fix. |
| Comment by Eliot Horowitz (Inactive) [ 09/Jun/10 ] |
|
this is probably less important |
| Comment by Aaron Staple [ 09/Jun/10 ] |
|
So this means that if you do find( {$or:[ {a:1}, {b:1}]} ).sort( {c:1}) we will either do a full scan of an index on c or a full table scan. The tricky case to implement for this is when the choice is between an in memory sort scan on {$or:[ {a:1}, {b:1}]} and a scan on index {c:1}(possibly with a range limiter in a non or portion of the query) because we are comparing one option that is a sequence of query plans with an option that is just a single plan. Even more difficult to implement would be potentially leveraging the existing sort order inside or clauses. For example if we wanted to do this without an in memory sort , {b:1}]} ).sort( {c:1}) using indexes {a:1,c:1}and {b:1,c:1} |
| Comment by Eliot Horowitz (Inactive) [ 09/Jun/10 ] |
|
Not exactly sure what you mean |