[SERVER-18426] $geoNear expands aggressively if the centroid is far from the dense data Created: 12/May/15 Updated: 04/Jan/24 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Geo |
| Affects Version/s: | 3.0.2 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Andrey Hohutkin | Assignee: | Backlog - Query Integration |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | qi-geo, query-44-grooming | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
| Issue Links: |
| Assigned Teams: |
Query Integration
|
| Participants: |
| Description |
|
I have a collection with 300,000 documents. The collection has a 2dsphere index on the 'geo' field. Here is the query I run: ); The stats of this query are weird: in the query I limit the results to 500 documents, but the stats show that MongoDB reads 80166 documents (objectsLoaded) from disk and only then trims the result set down to the limit. |
| Comments |
| Comment by Siyuan Zhou [ 24/Jul/15 ] |
|
As brandon.zhang explained above, this is different from |
| Comment by Brandon Zhang [ 24/Jul/15 ] |
|
This behavior is due to the way geoNear expands its search. geoNear works by searching for documents in distance intervals successively farther from the centroid. At each interval, it will sort the documents by distance and return them to its parent stage. If the number of documents returned in an interval is less than 300, the next distance interval will double its range. |
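The expansion described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the server's code: documents are reduced to their distance from the centroid, `initial_width` and the function name `geo_near_sketch` are hypothetical, and the interval width is assumed to stay unchanged when an interval yields 300 or more documents. It shows how a limit of 500 can still load tens of thousands of documents when the dense data lies far from the centroid.

```python
from bisect import bisect_left


def geo_near_sketch(distances, limit, initial_width, min_per_interval=300):
    """Simulate an expanding-interval near search over precomputed distances.

    Returns (results, examined, intervals). All parameter names are
    illustrative assumptions, not MongoDB internals.
    """
    dists = sorted(distances)
    max_dist = dists[-1] if dists else 0.0
    results, examined, intervals = [], 0, 0
    inner, width = 0.0, initial_width
    while len(results) < limit and inner <= max_dist:
        outer = inner + width
        # Every document in [inner, outer) is loaded before the limit applies.
        batch = dists[bisect_left(dists, inner):bisect_left(dists, outer)]
        examined += len(batch)
        intervals += 1
        results.extend(batch)  # batch is already sorted by distance
        if len(batch) < min_per_interval:
            width *= 2  # sparse interval: the next one covers twice the range
        inner = outer
    return results[:limit], examined, intervals


# Ten sparse documents near the centroid, then a dense cluster far away.
sparse = [float(i) for i in range(1, 11)]
dense = [1000.0 + i * 0.0001 for i in range(80000)]
results, examined, intervals = geo_near_sketch(sparse + dense, limit=500, initial_width=1.0)
print(len(results), examined, intervals)  # prints: 500 80010 10
```

In this run the sparse intervals keep doubling until a single wide interval swallows the whole dense cluster, so 80010 documents are examined to return 500.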
| Comment by Sam Kleinman (Inactive) [ 20/May/15 ] |
|
Thanks for this report. Because geoNear returns sorted results, it starts from the geometry (i.e. point) specified in the query and selects all points that lie within a radius of that starting point. The operation sorts these results, and if the query needs more results, it fetches all documents within a larger additional covering (a donut-shaped surface). Because of the way the geo indexes work, it's possible for the query to fetch and examine the same document more than once during a single query. We were able to reproduce your results, and while this behavior and performance are not desirable, they are expected given the current implementation. The work defined by Regards, |
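The donut-shaped covering can be sketched as a distance test against an inner and an outer radius. The following is a minimal illustration, not the server's implementation: it assumes a spherical Earth with the haversine formula, `next_batch` and `seen_ids` are hypothetical names, and the list of candidates stands in for index entries that may repeat the same document. Deduplicating by id is one simple way to keep a repeated candidate from being returned twice.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; a spherical approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def next_batch(center, candidates, inner_m, outer_m, seen_ids):
    """Return ids of unseen candidates inside the ring [inner_m, outer_m), nearest first.

    candidates is an iterable of (doc_id, (lon, lat)); the same doc_id may
    appear more than once, mimicking repeated index entries.
    """
    batch = []
    for doc_id, (lon, lat) in candidates:
        if doc_id in seen_ids:
            continue  # already returned by this or an earlier ring
        d = haversine_m(center[0], center[1], lon, lat)
        if inner_m <= d < outer_m:
            seen_ids.add(doc_id)
            batch.append((d, doc_id))
    batch.sort()
    return [doc_id for _, doc_id in batch]


center = (0.0, 0.0)
candidates = [
    ("c", (0.02, 0.0)),   # roughly 2.2 km east
    ("b", (0.0, 0.01)),   # roughly 1.1 km north
    ("a", (0.0, 0.001)),  # roughly 111 m north
    ("b", (0.0, 0.01)),   # repeated entry for the same document
]
seen = set()
print(next_batch(center, candidates, 0.0, 1000.0, seen))     # first disc
print(next_batch(center, candidates, 1000.0, 3000.0, seen))  # surrounding donut
```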