  Core Server / SERVER-26974

Poor 2dsphere / $near performance and inconsistent object scanning

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Works as Designed
    • Affects Version/s: 3.2.10
    • Fix Version/s: None
    • Component/s: Geo, Querying
    • Labels:
      None
    • Operating System:
      ALL
    • Sprint:
      Query 2017-01-23

      Description

      $near query performance appears to vary widely depending on the data set and the query coordinates.

      It is very easy to reproduce. I created a test database using the restaurants dataset from the geospatial tutorial in the docs:

      https://docs.mongodb.com/v3.2/tutorial/geospatial-tutorial/
      https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/restaurants.json

      I imported this dataset into a collection and created a 2dsphere index on the `location` field.
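The report does not show the index command itself. As a minimal sketch, assuming the index is a 2dsphere index (as the title suggests) and the collection is named `restaurants` (as in the queries in this report), the step could look like this. `ensureLocationIndex` is a hypothetical helper name used only for illustration.

```javascript
// Sketch only: create a 2dsphere index on `location` so that $near with
// $geometry can use it. `ensureLocationIndex` is a hypothetical helper;
// `db` is expected to be the shell's current database handle.
function ensureLocationIndex(db) {
  // createIndex is the standard collection method in the mongo shell.
  return db.restaurants.createIndex({ location: "2dsphere" });
}
```

In the mongo shell this would be called as `ensureLocationIndex(db)`, which is equivalent to typing `db.restaurants.createIndex({ location: "2dsphere" })` directly.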

      In a mongo shell, if I run this query with explain:

      // near Mexico
      db.restaurants.find({ location: { $near: { $geometry: { type: "Point", coordinates: [ -109.16015625, 17.5602465032949 ] } } }}).limit(1).explain("executionStats")
      

      this is the result I get:

      "executionStats" : {
              "executionSuccess" : true, 
              "nReturned" : NumberInt(1), 
              "executionTimeMillis" : NumberInt(203), 
              "totalKeysExamined" : NumberInt(25344), 
              "totalDocsExamined" : NumberInt(25340), 
      

      If I run this query:

      // near Denver, CO
      db.restaurants.find({ location: { $near: { $geometry: { type: "Point", coordinates: [ -104.9955372, 39.7642543 ] } } }}).limit(1).explain("executionStats")
      

      then I get:

      "executionStats" : {
              "executionSuccess" : true, 
              "nReturned" : NumberInt(1), 
              "executionTimeMillis" : NumberInt(2), 
              "totalKeysExamined" : NumberInt(7), 
              "totalDocsExamined" : NumberInt(4), 
      

      Both queries are identical, each with a limit of 1, except for the coordinates. The first one (far from the locations in the dataset) scans all the documents and takes about 200 ms, while the second one (within the US) scans just a few documents and takes 2 ms.
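To make the gap concrete, here is a small sketch that builds the shared $near filter and compares the two `executionStats` excerpts quoted in this report. The helper names `nearPointFilter` and `compareStats` are hypothetical and exist only for this illustration; the numbers are copied from the explain output in this report.

```javascript
// Build the $near filter used by both queries; only the coordinates differ.
function nearPointFilter(lng, lat) {
  return {
    location: {
      $near: { $geometry: { type: "Point", coordinates: [lng, lat] } }
    }
  };
}

// executionStats figures copied from the two explain outputs in this report.
var farFromData = {
  nReturned: 1,
  executionTimeMillis: 203,
  totalKeysExamined: 25344,
  totalDocsExamined: 25340
};
var nearDenver = {
  nReturned: 1,
  executionTimeMillis: 2,
  totalKeysExamined: 7,
  totalDocsExamined: 4
};

// Ratio of work done by query `a` relative to query `b`.
function compareStats(a, b) {
  return {
    docsRatio: a.totalDocsExamined / b.totalDocsExamined,
    keysRatio: a.totalKeysExamined / b.totalKeysExamined,
    timeRatio: a.executionTimeMillis / b.executionTimeMillis
  };
}

var ratios = compareStats(farFromData, nearDenver);
// ratios.docsRatio is 6335 and ratios.timeRatio is 101.5
```

In the mongo shell, the same filter can be passed to the query, for example `db.restaurants.find(nearPointFilter(-109.16015625, 17.5602465032949)).limit(1).explain("executionStats")`.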

      Am I doing something wrong?

      I was able to reproduce this on both WiredTiger and MMAPv1 with MongoDB 3.2.10.
      I am not sure whether it happens with earlier versions.

              People

              Assignee:
              david.storch David Storch
              Reporter:
              fallanic Fabien Allanic