[SERVER-23065] Geo predicate beneath $elemMatch object causes planner to ignore valid index with 2dsphereIndexVersion > 1
Created: 10/Mar/16  Updated: 05/May/16  Resolved: 21/Apr/16

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Querying
Affects Version/s: 3.0.10, 3.2.4
Fix Version/s: 3.3.5

Type: Bug
Priority: Critical - P2
Reporter: Bernard Gorman
Assignee: David Storch
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: explain_2dsphereIndexVersion_1.txt, explain_2dsphereIndexVersion_2.txt, explain_2dsphereIndexVersion_3.txt, indexTest.geo.min.js
Issue Links:
Duplicate
is duplicated by SERVER-22205 Index bounds are not populated for a ... Closed
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query 13 (04/22/16)
Participants:

 Description   

Issue Summary

With a schema and index such as the following:

Schema

{
  b : 1,
  d : [{
      e: 3,
      f: {
          type : "Point",
          coordinates:[-2, 53]
      }
    }
  ]
}

Index

{
  "b" : 1,
  "d.e" : 1,
  "d.f": "2dsphere"
}

A query on these fields that uses $elemMatch on d results in a COLLSCAN with 2dsphereIndexVersion 2 and 3, on both 3.0.10 and 3.2.4. With everything else identical, switching to 2dsphereIndexVersion 1 lets the planner correctly use the index with the expected bounds.

Reproduction

Run the following script and verify that the explain output shows a COLLSCAN plan. Then change 2dsphereIndexVersion to 1, re-run, and confirm that the index is used.

indexTest.geo.min.js

db.indextest.drop();
 
var row = {
  b : 1,
  d : [{
      e: 3,
      f: {
          type : "Point",
          coordinates:[-2, 53]
      }
    }
  ]
};
 
db.indextest.createIndex({
  "b" : 1,
  "d.e" : 1,
  "d.f": "2dsphere"
},{
    "2dsphereIndexVersion": 2
});
 
db.indextest.save(row);
 
var query = {
  b : 1,
  d : {
    $elemMatch : {
      e : 3,
      f:{
         $geoWithin : {
           $centerSphere : [[-2, 53], 2.523219554550177E-4]
         }
       }
    }
  }
};
 
var explain = db.indextest.find(query).explain();
var count = db.indextest.find(query).count();
 
printjson(explain);
printjson({ "count" : count });
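
To check the plan programmatically rather than by eye, the stage names can be pulled out of the explain document. The following is a minimal sketch, assuming the 3.0+ explain format for an unsharded collection, in which the winning plan sits under queryPlanner.winningPlan and child stages appear as inputStage or inputStages. The helper names collectStages and usesCollscan and the abbreviated sample plans are illustrative only; they are not taken from the attached explain output.

```javascript
// Hypothetical helper: walk a winning plan and list its stage names, parent first.
function collectStages(plan) {
  if (!plan) {
    return [];
  }
  var stages = [plan.stage];
  // A stage has either a single child (inputStage) or several (inputStages).
  var children = plan.inputStages || (plan.inputStage ? [plan.inputStage] : []);
  children.forEach(function (child) {
    stages = stages.concat(collectStages(child));
  });
  return stages;
}

// Hypothetical helper: true if any stage of the winning plan is a COLLSCAN.
function usesCollscan(explain) {
  return collectStages(explain.queryPlanner.winningPlan).indexOf("COLLSCAN") !== -1;
}

// Abbreviated, made-up plan shapes for illustration only.
var collscanExplain = { queryPlanner: { winningPlan: { stage: "COLLSCAN" } } };
var ixscanExplain = {
  queryPlanner: {
    winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN" } }
  }
};
```

In the repro script, usesCollscan(explain) could stand in for reading the printed plan: it should be true for the affected index versions and false once the index is used.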

Notes

  • Hinting the index with 2dsphereIndexVersion 2 or 3 results in the index bounds being set to [MinKey,MaxKey]

Attached

  • Repro script
  • explain output for each 2dsphereIndexVersion, all on 3.2.4


 Comments   
Comment by Githook User [ 21/Apr/16 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-23065 don't strip assignments to 2dsphere indices with geo predicate below elemMatch object
Branch: master
https://github.com/mongodb/mongo/commit/f6d17a463ef96f9623037fc9690cb04941085148

Comment by David Storch [ 18/Apr/16 ]

Simplified repro script:

(function() {
    'use strict';
 
    db.c.drop();
    db.c.ensureIndex({a: 1, 'b.c': '2dsphere'});
 
    var explain = db.c.explain().find({
      a : 1,
      b : {
        $elemMatch : {c : {$geoWithin : {$centerSphere : [ [ 0, 0 ], 1 ]}}}
      }
    }).finish();
 
    printjson(explain);
})();

Comment by David Storch [ 23/Mar/16 ]

We are still investigating the complexity of a fix, but I would like to provide some information regarding the diagnosis of this issue.

The index selection phase of query planning involves a process of matching predicates to indices. QueryPlannerIXSelect contains a suite of functions which tag predicates with the indices that they can use. For example, the predicate "field a is less than 3" is tagged to indicate that it can use the second position of the compound index {b: 1, a: 1}.

As part of this process, we must enforce the "geo-sparseness" property of 2dsphere indices with 2dsphereIndexVersion 2 or 3. The geo-sparseness property is documented in the MongoDB manual and was introduced in MongoDB 2.6 as part of 2dsphereIndexVersion 2. It means that only documents that store geometry in the 2dsphere-indexed path generate index keys and are inserted into the index. For example, with index {a: 1, b: "2dsphere"}, only documents containing geometry inside field b get indexed.

The consequence of this sparseness property is that, for index {a: 1, b: "2dsphere"}, a query on a but not b cannot use the index since this will miss any documents that have no b field. This is enforced via a function that removes assignments to 2dsphere indices that are logically incorrect: QueryPlannerIXSelect::stripInvalidAssignmentsTo2dsphereIndices().
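
The effect of geo-sparseness on a query over a alone can be seen in a toy model. This is only a sketch in plain JavaScript, not server code: the function buildSparseGeoIndex and the key shape are invented for illustration, and the real geo key component is left out entirely.

```javascript
// Toy model of geo-sparseness for index {a: 1, b: "2dsphere"}.
// Not server code: the geo part of each key is omitted for brevity.
var docs = [
  { _id: 1, a: 1, b: { type: "Point", coordinates: [0, 0] } },
  { _id: 2, a: 1 } // no geometry in b
];

// Sparse behaviour: only documents with a value in b produce an index key.
function buildSparseGeoIndex(collection) {
  return collection
    .filter(function (doc) { return doc.b !== undefined; })
    .map(function (doc) { return { a: doc.a, _id: doc._id }; });
}

var sparseIndex = buildSparseGeoIndex(docs);

// Answer {a: 1} from the index keys alone, then by scanning every document.
var viaIndex = sparseIndex
  .filter(function (key) { return key.a === 1; })
  .map(function (key) { return key._id; });
var viaScan = docs
  .filter(function (doc) { return doc.a === 1; })
  .map(function (doc) { return doc._id; });
```

Here viaIndex holds only the document that has geometry in b, while viaScan holds both documents, which is why an assignment of a alone to such an index has to be stripped.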

For the problem query reported in this ticket, assignments to the 2dsphere index are stripped when they should not be. This happens because the code expects the predicate over b and the predicate over the 2dsphere field d.f to be at the same level in the abstract syntax tree. Since the d.f geo predicate is beneath an $elemMatch and not directly joined to b by an AND node, we mistakenly remove the assignment of b to the index. The end result is a COLLSCAN plan.
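
The difference between a sibling-only check and one that looks beneath $elemMatch can be sketched on a toy predicate tree. This is a deliberately simplified illustration in JavaScript, not the C++ in QueryPlannerIXSelect and not the change made in the linked commit; the node type names and the functions hasGeoSibling and hasGeoBelow are assumptions made up for the example.

```javascript
// Toy predicate tree for {b: 1, d: {$elemMatch: {e: 3, f: {$geoWithin: ...}}}}.
// Node types and field names are simplified stand-ins, not the server's classes.
var tree = {
  type: "AND",
  children: [
    { type: "EQ", path: "b" },
    {
      type: "ELEM_MATCH_OBJECT",
      path: "d",
      children: [
        { type: "EQ", path: "d.e" },
        { type: "GEO", path: "d.f" }
      ]
    }
  ]
};

// Sibling-only check: looks for a geo predicate among the AND's direct children.
function hasGeoSibling(andNode) {
  return andNode.children.some(function (child) {
    return child.type === "GEO";
  });
}

// Recursive check: also descends through child nodes, including $elemMatch objects.
function hasGeoBelow(node) {
  if (node.type === "GEO") {
    return true;
  }
  return (node.children || []).some(hasGeoBelow);
}
```

On this tree hasGeoSibling(tree) is false, which mirrors the case where the assignment of b is stripped, while hasGeoBelow(tree) is true because it finds the d.f predicate under the $elemMatch node.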
