[SERVER-4565] Count() takes exceptionally long time/hangs Created: 27/Dec/11 Updated: 06/Apr/23 Resolved: 17/Jan/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 2.0.2 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Minor - P4 |
| Reporter: | Brett Goldstein | Assignee: | Bill Hayward |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Red Hat Linux Enterprise |
||
| Attachments: |
|
| Participants: |
| Description |
|
Having a problem with following query: db.test.count({"location" : {"$within" : {"$center" : [[ -87, 41], 5]}}, "initial_type" : "x"}) I built an index for geospatial and the above call type. Coordinates are reduced for privacy reasons fyi... The find() returns exceptionally fast...but the count above just hangs. Any quick thoughts? thx |
| Comments |
| Comment by Richard Kreuter (Inactive) [ 28/Dec/11 ] |
|
Yes, please make a new issue for the separate topic. |
| Comment by Brett Goldstein [ 28/Dec/11 ] |
|
i have some questions about faults....should i open a new thread? |
| Comment by Brett Goldstein [ 28/Dec/11 ] |
|
got the diff between the count and find ...the issue was too much data within the search radius. what we are doing is lab testing mongodo to be the aggregation point of very large sets of disparate spatial data. the intent would be to run a mapping layer on top to pass a coordinate, data para, radius to return everything in that area. we may also layer on some real-time data...whether it goes to mongo first or comes from a geospatial SDE is TBD. make sense? |
| Comment by Richard Kreuter (Inactive) [ 28/Dec/11 ] |
|
@Brett: are you saying that find() returns a cursor immediately, or that iterating through all the documents that match is seemingly immediate? The count and the query both traverse a subset of the index to see what documents match the query, but count has to do so immediately, whereas the query only steps through the index as the client requests documents from the server by iterating the cursor. So if you're comparing the performance of var cnt = db.test.count(...); and var cur = db.test.find(...); then you're comparing very different things. Anyway, I'm still curious what you're trying to do with this geo query. To reiterate, that radius of 5 in the $within part of your query specifies a circle of about 300-350 miles around your center point. That's a pretty big space, relatively speaking; how much of your data set falls within that area? |
| Comment by Eliot Horowitz (Inactive) [ 28/Dec/11 ] |
|
Are these lat/long? |
| Comment by Brett Goldstein [ 27/Dec/11 ] |
|
ok...been playing with this. find() returns almost instantly...and count() does not (and i am just adding .count() to the end of the query. however, when i radically reduce the radius, the return improves dramatically. my first move was from 5 -> 1 and that did not impact much. However,moving from1 -> .001 has a 1 second return with count() = ~ 11k points. |
| Comment by Richard Kreuter (Inactive) [ 27/Dec/11 ] |
|
Attempt to reproduce problem described, with randomly-generated data. Possibly meaningless. |
| Comment by Richard Kreuter (Inactive) [ 27/Dec/11 ] |
|
Have you compared how long it takes to iterate through all the documents that match the find() to how long it takes for count() to return? In local testing (with a randomly generated data set but identical geospatial index), the count() takes a similar amount of time as iterating through all the documents from the find(). (Additionally, note that the geo part of your query searches 5 degrees of arc over a hypothetical great circle on the earth, or about a 350 mile radius around [-87, 41]. If you have a lot of points within that radius and if any specific value for initial_type is relatively sparsely distributed within that search space, the "haystack" index may be more suitable. See: http://www.mongodb.org/display/DOCS/Geospatial+Haystack+Indexing But it's also possible that the geo search isn't necessary at all. Could you say any more about what you're trying to find?) |
| Comment by Brett Goldstein [ 27/Dec/11 ] |
|
originally when i first saw the count() issue it was 24 million...now 44808486 |
| Comment by Dwight Merriman [ 27/Dec/11 ] |
|
what does db.test.count() return? |