[SERVER-1026] check $in speed Created: 16/Apr/10 Updated: 12/Jul/16 Resolved: 22/Jun/10 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | 1.5.4 |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Eliot Horowitz (Inactive) | Assignee: | Aaron Staple |
| Resolution: | Done | Votes: | 7 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
We should make sure it's as fast as possible and doing sane things. |
| Comments |
| Comment by auto [ 22/Jun/10 ] |
|
Author: {'login': 'astaple', 'name': 'Aaron', 'email': 'aaron@10gen.com'} Message: |
| Comment by Aaron Staple [ 22/Jun/10 ] |
|
200 seems kind of low - you can have a single $in array with over 200 elements, and I imagine we probably want to support that. (I saw an example in the mailing list where someone wanted to get all of 10k or 100k ids in one query.) I'm going to make the limit 1 million; please let me know if we should change it. |
| Comment by Aaron Staple [ 22/Jun/10 ] |
|
Ok I'll do that, thanks |
| Comment by Eliot Horowitz (Inactive) [ 22/Jun/10 ] |
|
We can put a sane limit on it. |
| Comment by Aaron Staple [ 22/Jun/10 ] |
|
I think that implicit limit is pretty high. For example, we can clearly fit 10 fields having $in clauses with 10 elements each in a 4 MB query, but that generates 10^10 index bounds, which is more than we can really handle. |
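The arithmetic in the comment above can be checked with a short Python sketch. It is illustrative only and not taken from the server code: `bound_count`, `exceeds_limit`, and `MAX_BOUNDS` are hypothetical names, and the 1 million cap is simply the figure proposed elsewhere in this thread, assumed here to apply to the total number of index bounds.

```python
import math

# Assumption: a cap on the total number of generated index bounds.
# The value mirrors the "1 million" figure proposed in this thread.
MAX_BOUNDS = 1_000_000


def bound_count(in_clauses):
    """Number of index bounds implied by a set of $in clauses.

    in_clauses maps a field name to its list of $in values. The count is
    the product of the list lengths, computed without building the bounds.
    """
    return math.prod(len(values) for values in in_clauses.values())


def exceeds_limit(in_clauses, limit=MAX_BOUNDS):
    """True if the query would generate more bounds than the cap allows."""
    return bound_count(in_clauses) > limit


# 10 fields, each with a 10-element $in clause, as in the comment above.
ten_by_ten = {f"f{i}": list(range(10)) for i in range(10)}
count = bound_count(ten_by_ten)
too_many = exceeds_limit(ten_by_ten)
```

Because only the list lengths are multiplied, a check like this is cheap even when the product itself is far too large to enumerate.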
| Comment by Eliot Horowitz (Inactive) [ 22/Jun/10 ] |
|
There is an implicit limit since the query has to be a valid BSON object. |
| Comment by Aaron Staple [ 22/Jun/10 ] |
|
@eliot - with this implementation, a user can potentially generate a number of index bounds that is exponential in the number of fields in the index. For example, find( {a:{$in:[0,1]},b:{$in:[0,1]},c:{$in:[0,1]}, ... } ). Currently those index bounds all need to go in memory up front. We could potentially define the bounds implicitly and generate them as needed (smop), but in that case we'd still need to loop over all of them. Do we want to place any sort of limit on the number of field bounds, or just leave it up to the developer to avoid this situation? |
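A rough Python sketch of the "generate them as needed" idea from the comment above, assuming each bound is just one combination of $in values. `iter_bounds` is a hypothetical helper, not the server's implementation. It keeps only one combination in memory at a time, but the total number of combinations to loop over is still exponential in the number of fields.

```python
from itertools import islice, product


def iter_bounds(in_clauses):
    """Lazily yield one combination of $in values per index bound.

    in_clauses maps a field name to its list of $in values. Nothing is
    materialized up front beyond the input lists themselves.
    """
    fields = list(in_clauses)
    for combo in product(*(in_clauses[f] for f in fields)):
        yield dict(zip(fields, combo))


# 20 fields with {$in: [0, 1]} each, following the find(...) example above.
query = {f"f{i}": [0, 1] for i in range(20)}

# Taking the first few bounds is cheap...
first_three = list(islice(iter_bounds(query), 3))

# ...but a full scan would still visit 2^20 combinations.
total = 2 ** len(query)
```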
| Comment by auto [ 22/Jun/10 ] |
|
Author: {'login': 'astaple', 'name': 'Aaron', 'email': 'aaron@10gen.com'} Message: |
| Comment by Aaron Staple [ 14/Jun/10 ] |
|
In particular, if there are multiple $in constraints and a compound index, we should have a separate field range for each element in the set product of the $in clauses. |
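As a minimal illustration of the set product described above, the following Python sketch builds one tuple of per-field point ranges for each combination of $in values. `point_ranges` and the `(low, high)` representation are assumptions made for the example and do not reflect the server's actual data structures.

```python
from itertools import product


def point_ranges(index_fields, in_clauses):
    """Return one tuple of per-field (low, high) bounds per combination.

    index_fields lists the compound index keys in order; in_clauses maps
    each key to its $in values. An equality match on value v is modelled
    as the closed point range (v, v).
    """
    value_lists = [in_clauses[field] for field in index_fields]
    return [
        tuple((value, value) for value in combo)
        for combo in product(*value_lists)
    ]


# Compound index on (a, b) with a:{$in:[1,2]} and b:{$in:[10,20,30]}
# gives 2 * 3 = 6 separate field ranges.
ranges = point_ranges(["a", "b"], {"a": [1, 2], "b": [10, 20, 30]})
```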