[SERVER-13041] count queries that can be index only are slower in 2.6 Created: 05/Mar/14 Updated: 11/Jul/16 Resolved: 07/Mar/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance, Querying |
| Affects Version/s: | 2.6.0-rc0 |
| Fix Version/s: | 2.6.0-rc2 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Mark Callaghan | Assignee: | hari.khalsa@10gen.com |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | ALL |
| Steps To Reproduce: | Create a 10M doc collection with 3 columns (_id, c1, c2). Values inserted were from (0,0,0), (1,1,1), .... to (9999999, 9999999, 9999999).
— Then create an index on c1...
— Then run these queries...
— then look at the mongod error log for slow query results. |
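A minimal Python sketch of the load and count steps above, for anyone who wants to repeat the test. It is not the attached script: the database and collection names, the batch size, and the three range predicates are hypothetical stand-ins, since the actual queries are only in the attachment. It assumes a recent pymongo and issues the `count` command through `Database.command`.

```python
from itertools import islice


def make_docs(n):
    """Yield documents (i, i, i) for i in [0, n), as in the steps above."""
    for i in range(n):
        yield {"_id": i, "c1": i, "c2": i}


def batches(docs, size):
    """Group an iterable of documents into lists of at most `size` items."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_and_count(uri="mongodb://localhost:27017", n=10_000_000):
    """Load n docs, index c1, then run one count per access path."""
    # Imported here so the helpers above can be used without pymongo.
    from pymongo import ASCENDING, MongoClient

    db = MongoClient(uri)["test"]      # hypothetical database name
    coll = db["count_repro"]           # hypothetical collection name
    coll.drop()                        # start from an empty collection
    for chunk in batches(make_docs(n), 10_000):
        coll.insert_many(chunk)
    coll.create_index([("c1", ASCENDING)])

    # Hypothetical predicates, NOT the queries from the attachment.
    filters = [
        {"_id": {"$gte": 0}},  # can use the index on _id
        {"c1": {"$gte": 0}},   # can use the index on c1
        {"c2": {"$gte": 0}},   # no index on c2, so a full scan
    ]
    # Issue the count command directly and read its "n" field.
    return [db.command("count", coll.name, query=f)["n"] for f in filters]
```

Calling `load_and_count()` needs a running mongod and pymongo installed; the slow-query timings still have to be read from the mongod log as described above.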
| Participants: |
| Description |
|
Count queries that can be index only are slower in 2.6 rc0 than in 2.4.9. For a 10M doc collection with an index on c1 I ran these queries. Queries 1,2 can use the PK on _id. Queries 2,3,4,5 can use the index on c1. Queries 6,7 should do a full scan.
I ran the set of queries twice and looked at times for the second run. All data is cached in RAM. For 2.4.9 the times are ~750 millis each for the first 5 queries and then ~2.4 seconds for the last 2 queries. For 2.6 rc0 the times are ~2 seconds each for the first 5 queries and then ~2.9 seconds for the last 2 queries that should do a full scan. The index-only count queries are much slower in 2.6 (~2 seconds vs ~750 millis). Full scan count queries are also slower (2.9 vs 2.4 seconds) but the difference is less significant. Alas, I can't do find().count().explain(). Sorry for mixing JavaScript and Python, I am new to JavaScript. I also don't have the latest version of pymongo. |
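To put the quoted numbers side by side, here is a small Python calculation of the slowdown factors. The inputs are the approximate timings from the description; the dictionary layout and the `slowdown` helper are just illustrative.

```python
# Approximate timings quoted in the description, in milliseconds.
timings_ms = {
    "index-only count": {"2.4.9": 750, "2.6.0-rc0": 2000},
    "full-scan count": {"2.4.9": 2400, "2.6.0-rc0": 2900},
}


def slowdown(old_ms, new_ms):
    """Return how many times slower the new version is than the old one."""
    return new_ms / old_ms


ratios = {
    name: round(slowdown(t["2.4.9"], t["2.6.0-rc0"]), 2)
    for name, t in timings_ms.items()
}
print(ratios)
```

With these rough inputs the index-only counts come out at about 2.67x slower and the full-scan counts at about 1.21x slower.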
| Comments |
| Comment by hari.khalsa@10gen.com [ 07/Mar/14 ] |
|
According to our benchmarks 2.6 should now be faster than 2.4. I'm going to resolve this ticket but do reopen if it's not solved for you. Thanks again for reporting it. |
| Comment by Davide Italiano [ 06/Mar/14 ] |
|
In my test case scenarios this commit definitely fixes the regression:
Also, this is faster than 2.4.9, results here for completeness:
FYI, to reproduce, get mongo-perf from github, build it and run
and wait for the output of Commands::CountsIntIDRange |
| Comment by Mark Callaghan [ 06/Mar/14 ] |
|
Thanks, will repeat tests on RC2 |
| Comment by hari.khalsa@10gen.com [ 06/Mar/14 ] |
|
Hi mdcallag, thanks for the heads up. I've submitted a change that should improve indexed count performance. To be honest, there was some superfluous code in the fast-count execution stage that was left over from when the count stage was created by chipping away at the index scan stage. My change removes that cruft. I'll have more concrete numbers about the impact when our performance suites run internally. Feel encouraged to test the change out and report back here. Unfortunately this change didn't make the RC1 cutoff so you'll have to pull the latest code from master to test it yourself. Thanks again! |
| Comment by Githook User [ 06/Mar/14 ] |
|
Author: {u'username': u'hkhalsa', u'name': u'Hari Khalsa', u'email': u'hkhalsa@10gen.com'}
Message: |
| Comment by Mark Callaghan [ 05/Mar/14 ] |
|
From explain output for that last batch of queries. |
| Comment by Mark Callaghan [ 05/Mar/14 ] |
|
Replaced the .count() with a predicate on c2 that matches zero rows and repeated tests. In this case 2.6 did better.
And from the tail of the mongod error log, the results for the 3rd and 5th queries are interesting to me because I have hinted the index scan and 2.6 is faster. That might be from a different change that avoids evaluating a predicate implied by the index scan. |
| Comment by Mark Callaghan [ 05/Mar/14 ] |
|
Just learned not to paste code; reformatting kills it. Attached the queries in one file and the Python for loading in another. |