[SERVER-1205] $or sort does not use index ranges expressed in or clauses Created: 08/Jun/10  Updated: 07/Apr/23  Resolved: 10/Jan/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 2.5.5

Type: Improvement Priority: Major - P3
Reporter: Aaron Staple Assignee: hari.khalsa@10gen.com
Resolution: Done Votes: 115
Labels: query_triage
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File server1205.js    
Issue Links:
Depends
Duplicate
is duplicated by SERVER-7327 Using "OR" in a query with ordering r... Closed
is duplicated by SERVER-2057 $or with sort Closed
Related
related to SERVER-13618 Optimization for sorted $in queries n... Closed
related to SERVER-12024 re-implement $in in terms of $or Backlog
related to DOCS-5804 Add example for $or case to the sorti... Closed
related to SERVER-8790 Introduce composable "stages" in quer... Closed
Participants:
Case:

 Description   

Decide whether or not this is necessary and if it is then implement.



 Comments   
Comment by Edward [ 27/Apr/14 ]

I have created an issue with explaining the above: https://jira.mongodb.org/browse/SERVER-13754

Comment by Asya Kamsky [ 27/Apr/14 ]

Edouard: please include data, query and explain with these kinds of reports otherwise it's very difficult to validate what works like it's supposed to and what doesn't.

Comment by Edward [ 27/Apr/14 ]

It seems that at current version (2.6.1), the query optimizer is still unable to optimize queries that use $or, $in, limit() and sort() all at once. The https://jira.mongodb.org/browse/SERVER-1205 and https://jira.mongodb.org/browse/SERVER-3310 fixes, each only improved performance on queries having 3 out of the 4 operations listed above. When introducing a 4th operation into the query, the optimization goes out the window. This behavior is observed with full index and document scans within the $or clause, even though a limit() is specified.

Comment by Githook User [ 17/Apr/14 ]

Author:

{u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}

Message: SERVER-1205 add test for explode for sort with meta sort spec
Branch: master
https://github.com/mongodb/mongo/commit/5c486649ce4f237db01942e48ea8683489bac8ca

Comment by Oleg Slobodskoi [ 14/Feb/14 ]

Hopefully this will be released soon, as it is a blocker issue for me too:

https://groups.google.com/forum/#!topic/mongodb-user/Vbmp3_dtarY

Thanks to Andrew Ryder who has explained this.

Comment by Githook User [ 10/Jan/14 ]

Author:

{u'username': u'hkhalsa', u'name': u'Hari Khalsa', u'email': u'hkhalsa@10gen.com'}

Message: SERVER-1205 explode point interval scans to obtain sort
Branch: master
https://github.com/mongodb/mongo/commit/9cfca59caf6426bf3c14fa3730aa58003b223386

Comment by Silvestrs Kante [ 31/Dec/13 ]

+1 This issue is causing lot of workarounds to our solution.

Comment by Oran Ben-Zur [ 10/Nov/13 ]

+1, critical for the entire developer team in my company.

Comment by Paul Klimashkin [ 10/Nov/13 ]

+1, Faced with the same problem. Very critical
Really can wait in 2.5.4?

Comment by Subramanya HS [ 04/Oct/13 ]

+1, we are facing lot of challenges with this bug

Comment by Srinivas R [ 04/Oct/13 ]

+1
if someone using mongoose (see: mongoose-when)
currently we use promises with multiple queries which asynchronously executes and wait for all to finish and apply some application logic for post processing.

Comment by Jonathan Gaillard [ 26/Aug/13 ]

+1 for this need as we make heavy use of these queries !

Comment by auto [ 19/Jul/13 ]

Author:

{u'username': u'hkhalsa', u'name': u'Hari Khalsa', u'email': u'hkhalsa@10gen.com'}

Message: SERVER-10026 SERVER-10192 SERVER-1205 sorting stages
Branch: master
https://github.com/mongodb/mongo/commit/cc54cdf3aa4a42308d135a393bf736aeb58d89d1

Comment by Eliot Horowitz (Inactive) [ 08/May/13 ]

The current fixVersion is 2.5.w (see above), meaning we're trying to get it into 2.6.
We're doing a lot of work on the query optimizer to make things of this nature easier.
Once that is done, this, and many related tickets will get resolved.

Comment by Petr Gottvald [ 07/May/13 ]

Common guys ! It's been 2 years since this issue was opened ! What is the current state on this ? Are you really going to fix this or not ?

Comment by Steve McMillen [ 28/Apr/13 ]

This bug will cost our organization a deal of time to develop around this deficiency. Can someone at 10gen please indicate if this is going to be resolved and when so we all don't have to keep guessing at if we should spend the effort to work around this problem or wait?

Comment by Johan Hedin [ 07/Apr/13 ]

Fully agree with the comments above. One vote from me as well and hoping to see this in 2.6.

Comment by Juheejin [ 28/Mar/13 ]

Really big issue for us... +1 for a fix.

Comment by Roy [ 15/Mar/13 ]

Really big issue for us... +1 for a fix.

Comment by Noel Lynagh [ 25/Feb/13 ]

We're in the process of switching back to RDBMS for our activity stream because of this. Quite disappointed as Mongo is the perfect tool in every other area for us.

Comment by Adam Baldwin [ 25/Feb/13 ]

So, which is more efficient, using the indices combined from the "or" or the index from the "sort"? Agreed, this is a major weakness compared to SQL DB's!

Comment by janson chen [ 18/Feb/13 ]

This bug has very long time.I need use or and sort query together in my job. I want you can solve it as soon as possible. Thanks!

Comment by Daniel Pasette (Inactive) [ 30/Jan/13 ]

10gen definitely considers this a priority and a fix is currently scheduled as one of the top priorities for the 2.6 release. We are currently making major refactor to the query planner (matcher) and optimizer which will include a fix for this issue. However, it was not easy to make this fix with the current architecture in advance of the fundamental code changes that are currently underway.

-Dan

Comment by Carlos Rodriguez [ 29/Jan/13 ]

Bugs like this have caused me to abandon MongoDB. I consider it a "legacy" part of my production infrastructure now.

How many years does a "Major" issue have to be open before 10gen gives a #*!&?

Comment by Daniel Gibbons [ 29/Jan/13 ]

+1 for fix. My workaround was to issue 2 separate sorted queries then merge the results in application logic.

Comment by drapeko [ 29/Jan/13 ]

+1 for a fix. spent significant amount of time trying to understand what's wrong

Comment by Noel Lynagh [ 24/Jan/13 ]

+1 for a fix. I first encountered this in summer 2011. The workaround I created then (10 separate queries with a huge limit and then combining the results in PHP) has now been outgrown.

Comment by Andreas Bernhardsson [ 24/Dec/12 ]

To bad that this was pushed back, I was really hoping or this in the next release. It would greatly cut down the amount of data I need to send back to the application from MongoDB. Now have to make about 8 queries with limit 20 and then sort and decide which 20 documents to use, instead of just letting MongoDB return the 20 documents.

I really hope you reconsider and include it in 2.3.x, It's all I really want for Christmas. =)

Comment by Solomon Wu [ 30/Nov/12 ]

This type of query is used in an important functionality (feed style, sorted by time).

Comment by Ivaylo Sharkov [ 29/Nov/12 ]

+1 to be fixed ASAP

Comment by Ivan Kotev [ 23/Nov/12 ]

Guys,
Will you able to make it in 2.3.1?
This is very critical for our activity feed performance optimization where most of the query (and result eliminating) parameters are within an $or clause.
Without this - the larger the collection is, the worse the performance is. It's just a matter of time to reach an unacceptable response time when the end user is trying to load its activity feed.

Comment by Patrick Neff [ 09/Oct/12 ]

I agree this needs to be fixed. Before I found this I thought I was just being stupid for a few weeks.

Comment by Ashish Trivedi [ 12/Sep/12 ]

Needs fix very soon. Its preventing very important functionality.

Comment by liugen [ 07/Sep/12 ]

Please fix it ASAP. Thanks!!!

Comment by elsonwu [ 30/Jul/12 ]

@Montse Medina
I do agree with you. Please fix it ASAP. Thanks!!!

Comment by Burcu Dogan [ 14/Jun/12 ]

Either fix it or document it loudly. At least devs should be aware of the need for an alternative filtering mechanism without querying with $ors.

Comment by Tao Chen [ 14/Jun/12 ]

We have the same problem. Will this be fixed in ver2.2 in July?

Comment by Venkata Ramanan [ 15/May/12 ]

Badly needing this to be fixed. Without this mongodb looks not so fit for report related applications.

Comment by Glen Xiao [ 11/May/12 ]

+1 vote. It does not work well even if "hint" an index. Unnecessary index entries and objects are scanned. It should work better.

Comment by Helmut Zechmann [ 22/Feb/12 ]

+1 for this beeing necessary from me too.
From my point of view, this is a basic feature i would like to use in my projects.
The fact that no index is beeing used for or/sort combinations makes it necessary to implement client side sorting.
Of course this is possible, but I think, the database should be able to do this.

Comment by Matthias Felsche [ 21/Feb/12 ]

+1 for this being necessary.

This is a basic query-use-case we really need.
Furthermore, with that change made, the behaviour produced by the API would be a bit more predictable.

Nonetheless, great work! I really enjoy using mongodb (except this issue).

Comment by Aaron Staple [ 19/Jan/12 ]

@arie - The issue here isn't the choice of indexes, but the fact that we can't currently sort results returned by a sequence of indexes instead of a single index (which is the way $or works). Thus even if you could hint a multiple index plan we'd need to sort the results - which is the bulk of the work here.

Comment by Arie Grapa [ 19/Jan/12 ]

guys, if you are not fixing the optimizer before 2.2 to make the right choice of plans, could you at least expose some mechanism, similar to "hint", so we can tell the optimizer which plan to use?

Comment by Jordan Berg [ 29/Nov/11 ]

Another vote on this being a pain! Thanks for all the great work mongo team.

Comment by Aaron Staple [ 24/Nov/11 ]

Michael - the change you have proposed has the effect of dropping the sort spec in some cases. You may get index usage in these cases, but the results may be ordered incorrectly. There is a high likelihood this feature will be implemented in 2.2.

Comment by Michael Barrientos [ 22/Nov/11 ]

This bug prevents us from implementing very basic features of our product, like search pagination because sorting by date absolutely kills performance, and workarounds are all performance killing as well. Because this is so painful, I started looking at the code. In db/queryoptimizer.cpp, line 858 currently looks like this:

if ( !order.isEmpty() || !min.isEmpty() || !max.isEmpty() ) {

Making the following change made it no longer do full table scans for at least some of the cases that affect our work:

if ( !min.isEmpty() || !max.isEmpty() ) {

I'm afraid to make this change by itself, since specifically banning sorts in the code suggests there's something downstream that assumes sort-order is empty. (i.e. it's a feature, not a bug).

Aaron, according to git blame, you're the only one touching this section of code. Is there any insight you can give into the reason those lines exist, or is it safe to just remove that? (It was last substantially touched in the fix for SERVER-1206).

Comment by Montse Medina [ 17/Nov/11 ]

This is extremely painful for us. This destroys the our performance. It's very critical that Mongo fixes this issue as soon as possible, as $or clauses are very frequent in our query load.
Please, fix in 2.1!

Comment by Swaroop Bhandary [ 05/Sep/11 ]

We are also facing a lot of problems. I fail to understand why would the sort not use the existing indexes. All sort are resulting in full table scans. Kindly schedule this bug fix on priority.

Comment by Thomas DeMille [ 25/Aug/11 ]

@Carlos Yes, we are having to devise hacky kind of workarounds as well, and abandon some functionality because of this...

Comment by Carlos Rodriguez [ 25/Aug/11 ]

If this is technically feasible to fix, it would be very helpful.

In my app, we have a prominent Facebook-like news feed which sorts by timestamp -1, and fetches activity related to the user's friends, groups, events, etc. There are many $or pieces, which combined with the sort would make the query unusable performance-wise due to this bug.

I've devised a workaround which involves a scatter-and-sort method. In PHP, each $or is broken into its own query, flattened with the non-$or conditions. This results in 12 or so indexed queries, limit 25 each. Then the results are combined and sorted back in PHP. Not ideal, it would be much better to have this done by mongo instead!

Comment by Thomas DeMille [ 22/Aug/11 ]

This is making a large number of my queries do full table scans.. The hint works, but I don't understand why the optimizer doesn't try that index and deduce that the index is MUCH faster?

Comment by Thomas DeMille [ 22/Aug/11 ]

I'm not sure I understand, if the query is {$or:[

{a:1}

,

{b:1}

]} ).sort(

{c:1}

and the index is

{a:1,b:1,c:1}

why wouldn't it choose that index?

Comment by Chad Remesch [ 15/Jul/11 ]

This has also caused us a lot of pain. We are current trying to come up with a workaround.

Comment by Petr Gottvald [ 21/Jun/11 ]

This has been making my life very painful lately. Can we pease get more votes on this issue ?

Comment by Ed Rooth [ 10/Feb/11 ]

This is preventing us from running the query that we would like to run (or query with a sort), so I'd just like to add my vote for a fix.

Comment by Eliot Horowitz (Inactive) [ 09/Jun/10 ]

this is probably less important

Comment by Aaron Staple [ 09/Jun/10 ]

So this means that if you do find( {$or:[

{a:1}

,

{b:1}

]} ).sort(

{c:1}

) we will either do a full scan of an index on c or a full table scan. The tricky case to implement for this is when the choice is between an in memory sort scan on {$or:[

{a:1}

,

{b:1}

]} and a scan on index

{c:1}

(possibly with a range limiter in a non or portion of the query) because we are comparing one option that is a sequence of query plans with an option that is just a single plan.

Even more difficult to implement would be potentially leveraging the existing sort order inside or clauses. For example if we wanted to do this without an in memory sort
find( {$or:[

{a:1}

,

{b:1}

]} ).sort(

{c:1}

) using indexes

{a:1,c:1}

and

{b:1,c:1}
Comment by Eliot Horowitz (Inactive) [ 09/Jun/10 ]

Not exactly sure what you mean

Generated at Thu Feb 08 02:56:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.