  1. Core Server
  2. SERVER-18724

findOne with $near scans the whole collection on a dense map

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.1.6
    • Affects Version/s: 3.0.3
    • Component/s: Geo
    • Labels:
      None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL

      Repro Steps:

      • Create a map covering (-0.005, -0.005) to (0.005, 0.005), roughly a 1000m by 1000m area, with loc points evenly spaced about 10m apart.
      • Index loc with 2dsphere.
      • Run findOne from near the middle of the map, for example:
        collection.find({loc: { $near: {$geometry: {type: "Point", coordinates:  [ 0.000329670329670329, 0.0009890109890109888 ] }}}} ).limit(1)
        

      The log shows that this query scanned every document in the collection (nscanned:10000):

      2015-05-28T14:04:27.872-0700 I QUERY    [conn108] query test.test111 query: { loc: { $near: { $geometry: { type: "Point", coordinates: [ 0.000329670329670329, 0.0009890109890109888 ] } } } } planSummary: GEO_NEAR_2DSPHERE { loc: "2dsphere" } ntoreturn:1 ntoskip:0 nscanned:10000 nscannedObjects:10000 keyUpdates:0 writeConflicts:0 numYields:78 nreturned:1 reslen:108 locks:{ Global: { acquireCount: { r: 79 } }, Database: { acquireCount: { r: 79 } }, Collection: { acquireCount: { r: 79 } } } 59ms
      
      • Setting finestIndexedLevel to its finest value, 30, improves this, but the query still scans a quarter of the collection (nscanned:2500).
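
To put the counters above in proportion, here is a minimal sketch in plain JavaScript (no server needed) that pulls the `name:number` counters out of an excerpt of the quoted log line and expresses nscanned as a fraction of the collection. The helper names `parseCounters` and `scanFraction` are hypothetical, and the collection size of 10000 is an assumption based on the nscanned value in the log.

```javascript
// Excerpt of the counters from the log line quoted in this report.
var logExcerpt = "ntoreturn:1 ntoskip:0 nscanned:10000 nscannedObjects:10000 " +
                 "keyUpdates:0 writeConflicts:0 numYields:78 nreturned:1 reslen:108";

// Hypothetical helper: collect every "name:digits" pair into an object.
function parseCounters(line) {
    var counters = {};
    var re = /\b([A-Za-z]+):(\d+)/g;
    var m;
    while ((m = re.exec(line)) !== null) {
        counters[m[1]] = parseInt(m[2], 10);
    }
    return counters;
}

// Hypothetical helper: share of the collection touched by the query.
function scanFraction(nscanned, totalDocs) {
    return nscanned / totalDocs;
}

var counters = parseCounters(logExcerpt);
var totalDocs = 10000; // assumed collection size

// Default index settings: every document is scanned to return one result.
console.log(counters.nscanned, counters.nreturned,
            scanFraction(counters.nscanned, totalDocs));

// With finestIndexedLevel set to 30, the report quotes nscanned 2500.
console.log(scanFraction(2500, totalDocs));
```

For a query that returns a single nearest point, both fractions are far larger than one would expect from a geo index, which is the behavior this report describes.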

      Repro Script

      // generate a grid map of GeoJSON points
      function generateGridMapGeoJSON(collection, x1, y1, x2, y2, indexType) {
          var step_x = (x2 - x1) / 100.0;
          var step_y = (y2 - y1) / 100.0;
      
          collection.drop();
          collection.ensureIndex({loc: indexType});
      
          for( var i = x1; i < x2; ) {
              var bulk = collection.initializeUnorderedBulkOp();
      
              for(var j = y1; j < y2; ) {
                  bulk.insert({loc: {type: "Point", coordinates: [i, j]}});
                  j = j + step_y;
              }
              bulk.execute( {w: 1});
              i = i + step_x;
          }
          collection.getDB().getLastError();
      }
      
      // define the area for the map in collection
      var x_min = -0.005;
      var x_max =  0.005;
      var y_min = -0.005;
      var y_max =  0.005;
      
      // define the area to run queries from
      // leave a 1/7 margin on each edge so that queries are not run out of bounds
      var x_query_min = x_min * (6.0/7.0);
      var x_query_max = x_max * (6.0/7.0);
      var y_query_min = y_min * (6.0/7.0);
      var y_query_max = y_max * (6.0/7.0);
      
      // queries run from a 13x13 grid
      var x_query_step = (x_query_max - x_query_min) / 13.0;
      var y_query_step = (y_query_max - y_query_min) / 13.0;
      
      var collection = db.getCollection("geo_near_test");
      generateGridMapGeoJSON(collection,  x_min, y_min, x_max, y_max, "2dsphere");
      
      collection.find({loc: { $near: {$geometry: {type: "Point", coordinates:  [ 0.000329670329670329, 0.0009890109890109888 ] }}}} ).limit(1)
      

      Copy and paste the script into the mongo shell to run it.
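
The script defines a 13x13 query grid but issues only one query. As an illustration, the sketch below (plain JavaScript, no server needed) shows how query points could be derived from those bounds. The helper name `queryPoint` is hypothetical, and treating grid points as whole step multiples from the minimum corner is an assumption. Under that assumption, the coordinates used in the query above correspond to grid index (7, 8), up to floating-point rounding.

```javascript
// Bounds and steps copied from the repro script.
var x_min = -0.005, x_max = 0.005;
var y_min = -0.005, y_max = 0.005;

var x_query_min = x_min * (6.0 / 7.0);
var x_query_max = x_max * (6.0 / 7.0);
var y_query_min = y_min * (6.0 / 7.0);
var y_query_max = y_max * (6.0 / 7.0);

var x_query_step = (x_query_max - x_query_min) / 13.0;
var y_query_step = (y_query_max - y_query_min) / 13.0;

// Hypothetical helper: GeoJSON point at grid index (i, j), with 0 <= i, j <= 13.
function queryPoint(i, j) {
    return {
        type: "Point",
        coordinates: [x_query_min + i * x_query_step,
                      y_query_min + j * y_query_step]
    };
}

// Grid index (7, 8) lands on the point used in the report's query.
var p = queryPoint(7, 8);
console.log(JSON.stringify(p));
```

In the mongo shell, the object returned by `queryPoint(i, j)` could be passed as the `$geometry` value of the `$near` query to repeat the test from other positions on the map.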

            Assignee:
            kevin.albertson@mongodb.com Kevin Albertson
            Reporter:
            rui.zhang Rui Zhang (Inactive)
            Votes:
            1
            Watchers:
            4
