[SERVER-23533] Multikey 2dsphere indexes can support compounding bounds in more cases due to keys generated Created: 05/Apr/16  Updated: 27/Dec/23

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Max Hirschhorn Assignee: Backlog - Query Integration
Resolution: Unresolved Votes: 0
Labels: qi-geo
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-61074 Investigate refactoring compound-inde... Backlog
Assigned Teams:
Query Integration
Participants:

 Description   

Note: This improvement is separate from any improvements to intersecting and/or compounding bounds with multikey 2dsphere indexes (SERVER-23112) because it doesn't rely on having access to path-level multikey information.

The "parallel arrays restriction" doesn't apply to 2dsphere indexes. Additionally, the key generation code for 2dsphere indexes produces keys by taking the Cartesian product of all distinct values (after expanding arrays) for each field.

// We take the Cartesian product of all of the keys.  This requires that we have
// some keys to take the Cartesian product with.  If keysToAdd.empty(), we
// initialize it.
if (keysToAdd.empty()) {
    keysToAdd = keysForThisField;
    continue;
}
 
BSONObjSet updatedKeysToAdd;
for (BSONObjSet::const_iterator it = keysToAdd.begin(); it != keysToAdd.end(); ++it) {
    for (BSONObjSet::const_iterator newIt = keysForThisField.begin();
         newIt != keysForThisField.end();
         ++newIt) {
        BSONObjBuilder b;
        b.appendElements(*it);
        b.append(newIt->firstElement());
        updatedKeysToAdd.insert(b.obj());
    }
}
keysToAdd = updatedKeysToAdd;

For example, the index {'a.nongeo': 1, 'a.geo': '2dsphere'} produces 3 index keys for the document {a: [{nongeo: [1, 2, 3]}, {geo: [0, 0]}]}

  • {"": 1, "": S2Key([0, 0])}
  • {"": 2, "": S2Key([0, 0])}
  • {"": 3, "": S2Key([0, 0])}

despite the geo-coordinates being specified in a different array element than the non-geo data. Similarly, the index {'a.nongeo1': 1, 'a.nongeo2': 1, 'a.geo': '2dsphere'} produces 9 index keys for the document {a: [{nongeo1: [1, 2, 3]}, {nongeo2: [4, 5, 6]}, {geo: [0, 0]}]}

  • {"": 1, "": 4, "": S2Key([0, 0])}
  • {"": 1, "": 5, "": S2Key([0, 0])}
  • {"": 1, "": 6, "": S2Key([0, 0])}
  • {"": 2, "": 4, "": S2Key([0, 0])}
  • {"": 2, "": 5, "": S2Key([0, 0])}
  • {"": 2, "": 6, "": S2Key([0, 0])}
  • {"": 3, "": 4, "": S2Key([0, 0])}
  • {"": 3, "": 5, "": S2Key([0, 0])}
  • {"": 3, "": 6, "": S2Key([0, 0])}

Since we produce combinations of index keys for elements that aren't necessarily part of the same array, the query {'a.nongeo1': 1, 'a.nongeo2': 4, 'a.geo': {$nearSphere: {$geometry: [0, 0]}}} should be able to use the 2dsphere index and compound the bounds, i.e.

"indexBounds" : {
  "a.nongeo1" : [
    "[1.0, 1.0]"
  ],
  "a.nongeo2" : [
    "[4.0, 4.0]"  // Instead of "[MinKey, MaxKey]"
  ],
  "a.geo" : [
    ...
  ]
}

because the index key {"": 1, "": 4, "": S2Key([0, 0])} exists. We can therefore assign additional predicates that occur on distinct paths, even if a shared prefix of the paths causes the index to be multikey.



 Comments   
Comment by Max Hirschhorn [ 05/Apr/16 ]

jason.rassi pointed out that the query needs to have a geo-predicate on the "geo" field in order return the full set of results while respecting the geo-sparseness rules for 2dsphereIndexVersion >= 2. I've updated the description of this ticket accordingly.

Generated at Thu Feb 08 04:03:40 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.