[SERVER-5063] $in on first compound key element and limit efficiency Created: 24/Feb/12 Updated: 28/Oct/15 Resolved: 09/Apr/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | 2.1.1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Eliot Horowitz (Inactive) | Assignee: | Aaron Staple |
| Resolution: | Done | Votes: | 2 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||
| Description |
|
If you have an index on ( a , b ) and do a query like find( { a : { $in : [ ... }} ).sort( { b : 1 } ).limit(10) |
| Comments |
| Comment by Aaron Staple [ 30/Jun/12 ] |
|
Hi Artur, No problem. It looks like your 'feelings_1_location.state_1_created_at_-1' index is multikey, which would explain why the optimization is not being used. (It is not supported for multikey indexes.) |
| Comment by Artur Rodrigues [ 30/Jun/12 ] |
|
Hi Aaron, Thanks for your prompt reply. Please find the information you asked for: v2.1.2 Indeed, as you said, v2.1 is reporting the nscanned values of all candidates. My fault for not reading the changelog throughly. Anyways, I was expecting a performance boost in v2.1 for this specific query given the compound index, but I think I'll have to wait until Thank you once again! |
| Comment by Aaron Staple [ 30/Jun/12 ] |
|
Hi Artur, I think you may be encountering a difference the explain behavior between 2.0 and 2.1. In 2.1 the nscanned reported for the overall query includes the nscanned values of all candidate plans attempted. In 2.0 only the nscanned of the "winning" query plan is reported. Could you please send the output of: db.dev06.find( ... ).explain( true ) ).explain( true ) for both 2.0 and 2.1? Also, it would be helpful for clarity if you could include the command you enter into the shell with the shell output you post. Thanks, |
| Comment by Artur Rodrigues [ 30/Jun/12 ] |
|
I would just like to share my experience, where I have the following indexes: "indexSizes" : { The query: , }).sort( {'created_at': -1}).limit(10).explain(); In v2.0.6 the results were: while in v2.1.2 there was a significant increase in the number of objects scanned: Even if I hint in v.2.0.6 the same index used in v2.1.2, it still scans less documents: |
| Comment by Aaron Staple [ 09/Apr/12 ] |
|
Hi James, The performance improvement you'll see will be very dependent on your data distribution, and it's possible to generate a benchmark with an arbitrary improvement multiple by varying the amount of data in your test. (For example I just ran a simple test for this optimization that showed a four order of magnitude performance improvement.) Is there a real world example data set you'd like to test? |
| Comment by James Smith [ 09/Apr/12 ] |
|
Out of interest, what was the speed difference for $in queries of this type after this change, do you have any benchmark results? |
| Comment by auto [ 07/Apr/12 ] |
|
Author: {u'login': u'astaple', u'name': u'Aaron', u'email': u'aaron@10gen.com'}Message: |