[SERVER-1026] check $in speed Created: 16/Apr/10  Updated: 12/Jul/16  Resolved: 22/Jun/10

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 1.5.4

Type: Question Priority: Major - P3
Reporter: Eliot Horowitz (Inactive) Assignee: Aaron Staple
Resolution: Done Votes: 7
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-802 Query optimization is doing a big sca... Closed
Participants:

 Description   

We should make sure $in is as fast as possible and is doing sane things.



 Comments   
Comment by auto [ 22/Jun/10 ]

Author:

{'login': 'astaple', 'name': 'Aaron', 'email': 'aaron@10gen.com'}

Message: SERVER-1026 limit combinatorial in bounds
http://github.com/mongodb/mongo/commit/5f3b74a454e375a12efa198b9eb81e7baa4fac4a

Comment by Aaron Staple [ 22/Jun/10 ]

200 seems kind of low - you can have a single $in array with over 200 elements, and I imagine we probably want to support that. (I saw an example in the mailing list where someone wanted to get all of 10k or 100k ids in one query.) I'm going to make the limit 1 million, please let me know if we should change.
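
As a rough illustration of the kind of limit discussed here, the sketch below counts the combinations a set of $in clauses would expand to and rejects the query with a clear message when the count exceeds 1 million. The names `MAX_IN_COMBINATIONS`, `count_combinations` and `check_in_limit` are hypothetical, and whether the real check is strict or inclusive is an assumption; this is not the server's code.

```python
import math

# Assumed constant name; the value is the 1 million limit mentioned in the comment.
MAX_IN_COMBINATIONS = 1_000_000


def count_combinations(in_clauses):
    """Number of index bounds produced by the set product of the $in arrays."""
    # The product is computed from the array lengths only, so nothing is materialized.
    return math.prod(len(values) for values in in_clauses.values())


def check_in_limit(in_clauses, limit=MAX_IN_COMBINATIONS):
    """Return the combination count, or raise with a clear message if over the limit."""
    total = count_combinations(in_clauses)
    if total > limit:  # assumption: the limit itself is still allowed
        raise ValueError(
            f"$in clauses expand to {total} index bounds, limit is {limit}"
        )
    return total


# A single $in array with 100k ids stays under the limit.
single_field_total = check_in_limit({"_id": list(range(100_000))})

# Ten fields with ten elements each expand to 10^10 combinations and are rejected.
ten_by_ten = {f"f{i}": list(range(10)) for i in range(10)}
try:
    check_in_limit(ten_by_ten)
    rejected = False
except ValueError:
    rejected = True
```

Counting from the array lengths keeps the check cheap even for queries that would be far too large to expand.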

Comment by Aaron Staple [ 22/Jun/10 ]

Ok I'll do that, thanks

Comment by Eliot Horowitz (Inactive) [ 22/Jun/10 ]

We can put a sane limit on it.
Maybe 200 ranges?
As long as the error message is clear - we can tweak later.

Comment by Aaron Staple [ 22/Jun/10 ]

I think that implicit limit is pretty high. For example, we can clearly fit 10 fields having $in clauses with 10 elements each in a 4 MB query, but that generates 10^10 index bounds, which is more than we can really handle.

Comment by Eliot Horowitz (Inactive) [ 22/Jun/10 ]

There is an implicit limit since the query has to be a valid BSON object.
So I think in this case we can leave it up to the developer to avoid horrible cases.

Comment by Aaron Staple [ 22/Jun/10 ]

@eliot - with this implementation, a user can potentially generate a number of index bounds that is exponential in the number of fields in the index. For example, find( {a:{$in:[0,1]},b:{$in:[0,1]},c:{$in:[0,1]}, ... } ). Currently those index bounds all need to go in memory up front. We could potentially define the bounds implicitly and generate them as needed (smop), but in that case we'd still need to loop over all of them.

Do we want to place any sort of limit on the number of field bounds, or just leave it up to the developer to avoid this situation?
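
To make the trade-off concrete, here is a minimal sketch of defining the bounds implicitly and generating them on demand, rather than holding them all in memory up front. `iter_bounds` is a hypothetical helper, not the server implementation. Memory stays small, but a full scan still has to loop over every combination, and the count doubles with each added two-element $in field.

```python
from itertools import islice, product


def iter_bounds(in_clauses):
    """Lazily yield one {field: value} point bound per combination of $in values."""
    fields = list(in_clauses)  # assumed to follow the index's field order
    for combo in product(*(in_clauses[f] for f in fields)):
        yield dict(zip(fields, combo))


# Ten fields, each with $in: [0, 1], as in the find() example above.
query = {name: [0, 1] for name in "abcdefghij"}

# Only the bounds that are actually consumed get built.
first_three = list(islice(iter_bounds(query), 3))

# The total is exponential in the number of fields: 2^10 here.
total = 2 ** len(query)
```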

Comment by auto [ 22/Jun/10 ]

Author:

{'login': 'astaple', 'name': 'Aaron', 'email': 'aaron@10gen.com'}

Message: SERVER-1026 handle stacked constraints from in clauses
http://github.com/mongodb/mongo/commit/155e38b679502bfee1a3caf5ca452667353074fc

Comment by Aaron Staple [ 14/Jun/10 ]

In particular if there are multiple $in constraints and a compound index, we should have a separate field range for each element in the set product of the in clauses.
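
A small sketch of what "one field range per element of the set product" means, assuming a compound index on (a, b) and using a hypothetical helper `point_bounds`. It illustrates the idea only and is not the server's data structure.

```python
from itertools import product


def point_bounds(in_clauses):
    """Return one {field: value} point bound per element of the set product."""
    fields = list(in_clauses)  # assumed to follow the compound index's field order
    # Deduplicate and sort each $in array so the bounds come out in index order.
    value_lists = [sorted(set(in_clauses[f])) for f in fields]
    return [dict(zip(fields, combo)) for combo in product(*value_lists)]


# Equivalent to find({a: {$in: [1, 2]}, b: {$in: [10, 20]}}) on an index over (a, b).
bounds = point_bounds({"a": [2, 1], "b": [10, 20]})
```

With two values for each field, `bounds` holds four entries, one per (a, b) pair.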

Generated at Thu Feb 08 02:55:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.