The performance of $near queries appears to vary widely depending on the data set and the query coordinates.
This is very easy to reproduce. I created a test database using the restaurants dataset from the $near docs:
https://docs.mongodb.com/v3.2/tutorial/geospatial-tutorial/
https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/restaurants.json
I imported this dataset into a collection and created an index on the `location` attribute.
In a mongo shell, if I run this query with explain:
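For reference, the index step might look like the sketch below. It assumes the index is the `2dsphere` index the tutorial creates, since a `$near` query with a GeoJSON `$geometry` needs one. The variable name `indexSpec` is only illustrative.

```javascript
// Assumption: the index on `location` is a 2dsphere index, as in the tutorial.
var indexSpec = { location: "2dsphere" };

// In the mongo shell (where `db` is defined) it would be applied with:
//   db.restaurants.createIndex(indexSpec)
```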
// near Mexico
db.restaurants.find({ location: { $near: { $geometry: { type: "Point", coordinates: [ -109.16015625, 17.5602465032949 ] } } } }).limit(1).explain("executionStats")
This is the result I get:
"executionStats" : {
    "executionSuccess" : true,
    "nReturned" : NumberInt(1),
    "executionTimeMillis" : NumberInt(203),
    "totalKeysExamined" : NumberInt(25344),
    "totalDocsExamined" : NumberInt(25340),
If I run this query:
// near Denver, CO
db.restaurants.find({ location: { $near: { $geometry: { type: "Point", coordinates: [ -104.9955372, 39.7642543 ] } } } }).limit(1).explain("executionStats")
then I get:
"executionStats" : {
    "executionSuccess" : true,
    "nReturned" : NumberInt(1),
    "executionTimeMillis" : NumberInt(2),
    "totalKeysExamined" : NumberInt(7),
    "totalDocsExamined" : NumberInt(4),
Both queries are identical, including the limit of 1, except for the coordinates. Yet the first one (far from the locations in the dataset) scans all the objects and takes about 200 ms, while the second one (within the US) scans just a few objects and takes 2 ms.
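To make the gap concrete, here is a small sketch that compares the two results. The numbers are copied from the explain outputs above. The helper `keysPerResult` and the variable names are hypothetical and not part of any MongoDB API.

```javascript
// Values copied from the two explain("executionStats") outputs above.
var farQuery = {
  label: "near Mexico",
  nReturned: 1,
  executionTimeMillis: 203,
  totalKeysExamined: 25344,
  totalDocsExamined: 25340
};
var closeQuery = {
  label: "near Denver, CO",
  nReturned: 1,
  executionTimeMillis: 2,
  totalKeysExamined: 7,
  totalDocsExamined: 4
};

// Hypothetical helper: index keys examined per document returned.
function keysPerResult(stats) {
  return stats.totalKeysExamined / stats.nReturned;
}

// How many times more index keys the far query examines for one result.
var keyRatio = keysPerResult(farQuery) / keysPerResult(closeQuery);

// How many times longer the far query takes.
var timeRatio = farQuery.executionTimeMillis / closeQuery.executionTimeMillis;
```

With these figures, the far query examines several thousand times more index keys and takes roughly a hundred times longer to return the same number of documents.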
Am I doing something wrong?
I was able to reproduce this on both WiredTiger and MMAPv1 with MongoDB 3.2.10.
I am not sure whether it happens with earlier versions.
Related to:
- SERVER-26518 Include density estimator in geoNear explain output (Backlog)
- SERVER-18426 $geoNear expands aggressively if the centroid is far from the dense data (Backlog)