Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-23533

Multikey 2dsphere indexes can support compounding bounds in more cases due to keys generated

    • Type: Icon: Improvement Improvement
    • Resolution: Unresolved
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Querying
    • Labels:
    • Query Integration

      Note: This improvement is separate from any improvements to intersecting and/or compounding bounds with multikey 2dsphere indexes (SERVER-23112) because it doesn't rely on having access to path-level multikey information.

      The "parallel arrays restriction" doesn't apply to 2dsphere indexes. Additionally, the key generation code for 2dsphere indexes produces keys by taking the Cartesian product of all distinct values (after expanding arrays) for each field.

      // We take the Cartesian product of all of the keys.  This requires that we have
      // some keys to take the Cartesian product with.  If keysToAdd.empty(), we
      // initialize it.
      >if (keysToAdd.empty()) {
          keysToAdd = keysForThisField;
          continue;
      }
      
      BSONObjSet updatedKeysToAdd;
      for (BSONObjSet::const_iterator it = keysToAdd.begin(); it != keysToAdd.end(); ++it) {
          for (BSONObjSet::const_iterator newIt = keysForThisField.begin();
               newIt != keysForThisField.end();
               ++newIt) {
              BSONObjBuilder b;
              b.appendElements(*it);
              b.append(newIt->firstElement());
              updatedKeysToAdd.insert(b.obj());
          }
      }
      keysToAdd = updatedKeysToAdd;
      

      For example, the index {'a.nongeo': 1, 'a.geo': '2dsphere'} produces 3 index keys for the document {a: [{nongeo: [1, 2, 3]}, {geo: [0, 0]}]}

      • {"": 1, "": S2Key([0, 0])}
      • {"": 2, "": S2Key([0, 0])}
      • {"": 3, "": S2Key([0, 0])}

      despite the geo-coordinates being specified in a different array element than the non-geo data. Similarly, the index {'a.nongeo1': 1, 'a.nongeo2': 1, 'a.geo': '2dsphere'} produces 9 index keys for the document {a: [{nongeo1: [1, 2, 3]}, {nongeo2: [4, 5, 6]}, {geo: [0, 0]}]}

      • {"": 1, "": 4, "": S2Key([0, 0])}
      • {"": 1, "": 5, "": S2Key([0, 0])}
      • {"": 1, "": 6, "": S2Key([0, 0])}
      • {"": 2, "": 4, "": S2Key([0, 0])}
      • {"": 2, "": 5, "": S2Key([0, 0])}
      • {"": 2, "": 6, "": S2Key([0, 0])}
      • {"": 3, "": 4, "": S2Key([0, 0])}
      • {"": 3, "": 5, "": S2Key([0, 0])}
      • {"": 3, "": 6, "": S2Key([0, 0])}

      Since we produce combinations of index keys for elements that aren't necessarily part of the same array, the query {'a.nongeo1': 1, 'a.nongeo2': 4, 'a.geo': {$nearSphere: {$geometry: [0, 0]}}} should be able to use the 2dsphere index and compound the bounds, i.e.

      "indexBounds" : {
        "a.nongeo1" : [
          "[1.0, 1.0]"
        ],
        "a.nongeo2" : [
          "[4.0, 4.0]"  // Instead of "[MinKey, MaxKey]"
        ],
        "a.geo" : [
          ...
        ]
      }
      

      because the index key {"": 1, "": 4, "": S2Key([0, 0])} exists. We can therefore assign additional predicates that occur on distinct paths, even if a shared prefix of the paths causes the index to be multikey.

            Assignee:
            backlog-query-integration [DO NOT USE] Backlog - Query Integration
            Reporter:
            max.hirschhorn@mongodb.com Max Hirschhorn
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated: