[SERVER-3435] Count is even slower than find(...)->sort(...)->limit(...) Created: 17/Jul/11 Updated: 29/Aug/11 Resolved: 18/Jul/11 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance, Querying |
| Affects Version/s: | 1.8.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Tim van Elsloo | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | performance, query | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Mac OS X 10.6.8 2.16 GHz |
||
| Operating System: | ALL |
| Participants: |
| Description |
|
I'm using a dummy collection of 18M rows in the following format: { _id: __unique_of_course__, name: __also_unique__, dummyID: __not_unique__ }There's an additional index on dummyID: ); Now, when I query it ... ); This will take several seconds .... Now, when I query the following ... ).sort( {dummyID: -1}).limit(10); This takes just one ms or so ... Is this some bug, or did I just forgot something? It'ld make sense if sorting a 18M result-set took more time than counting it ... (note: it may be faster when testing it on a real server, I'm testing it on my MacBook right now. Look at the relative difference between the benchmarks). |
| Comments |
| Comment by Eliot Horowitz (Inactive) [ 18/Jul/11 ] |
|
For count you are correct. For sorting, you are only correct if there isn't an index. If there is in an index, it can scan the collection in the correct order, so only has to look at 10 records, not N |
| Comment by Tim van Elsloo [ 18/Jul/11 ] |
|
@Eliot, thank you for your response Correct me if I'm wrong, but this is what `count` does, right (in basic, it's probably much harder): 1. Loop trough every item. So, indeed, it does need to scan all the matches. Now, (what I think) what's going on when `sort`ing. 1. Create a first temporary item-list. If I'm correct, wouldn't it make more sense that `sort`-ing took more time than `count`-ing? – As a temporary solution I've created a new collection, which contains the counters so I don't need to use the `count`-command anymore (just findOne( {someKey: someValue}).count). |
| Comment by Eliot Horowitz (Inactive) [ 18/Jul/11 ] |
|
The query itself seems perfectly indexed, so should be very fast to return the top 10. |
| Comment by Tim van Elsloo [ 17/Jul/11 ] |
|
Oh, I forgot to say that I'm using `mongos`, I haven't tested it using `mongod` yet. However, I don't think that would make a big difference, since the difference between those benchmarks is larger than just 2 queries in row. As I said, it's a dummy-environment, so I'm running 2 mongod-procs, 1 mongod-config-proc and 1 mongos-proc add the same address (localhost). |