Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-15168

Allow intersection of $min/$max bounds with computed query indexBounds

    • Type: Icon: Improvement Improvement
    • Resolution: Duplicate
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Querying
    • None

      Currently, when $min/$max are used to set index bounds, any index bounds that have been computed from the query predicate are ignored/overridden. There are some use cases (for example, using $min to perform pagination) where it would be useful to respect both the supplied $min/$max index bounds, as well as any bounds computed from the query predicate. Since the query results are always limited by the query predicate, it should always be safe to do this intersection.

      Currently, using $min alone (eg. to paginate by picking up in the index where the previous query left off) sets $max to MaxKey for all fields. This means that the computed upper index bound is ignored, and can cause the rest of the index to be (needlessly) scanned.

      For example, consider the following query (with index { b: 1, e: 1, f: 1, _id: 1 }):

      > db.test.find({a: "foo", b: { $gte: 1, $lte: 3 }, c: 1, d: 2014, e: true, f: 1, _id: { $ne: 105 }}).explain()
      {
              "cursor" : "BtreeCursor b_1_e_1_f_1__id_1",
              "isMultiKey" : false,
              "n" : 2,
              "nscannedObjects" : 944,
              "nscanned" : 945,
              "nscannedObjectsAllPlans" : 1889,
              "nscannedAllPlans" : 1892,
              "scanAndOrder" : false,
              "indexOnly" : false,
              "nYields" : 14,
              "nChunkSkips" : 0,
              "millis" : 6,
              "indexBounds" : {
                      "b" : [
                              [
                                      1,
                                      3
                              ]
                      ],
                      "e" : [
                              [
                                      true,
                                      true
                              ]
                      ],
                      "f" : [
                              [
                                      1,
                                      1
                              ]
                      ],
                      "_id" : [
                              [
                                      {
                                              "$minElement" : 1
                                      },
                                      105
                              ],
                              [
                                      105,
                                      {
                                              "$maxElement" : 1
                                      }
                              ]
                      ]
              },
              "server" : "genique:24802",
              "filterSet" : false,
              "stats" : {
                      "type" : "KEEP_MUTATIONS",
                      "works" : 948,
                      "yields" : 14,
                      "unyields" : 14,
                      "invalidates" : 0,
                      "advanced" : 2,
                      "needTime" : 944,
                      "needFetch" : 0,
                      "isEOF" : 1,
                      "children" : [
                              {
                                      "type" : "FETCH",
                                      "works" : 947,
                                      "yields" : 14,
                                      "unyields" : 14,
                                      "invalidates" : 0,
                                      "advanced" : 2,
                                      "needTime" : 944,
                                      "needFetch" : 0,
                                      "isEOF" : 1,
                                      "alreadyHasObj" : 0,
                                      "forcedFetches" : 0,
                                      "matchTested" : 2,
                                      "children" : [
                                              {
                                                      "type" : "IXSCAN",
                                                      "works" : 946,
                                                      "yields" : 14,
                                                      "unyields" : 14,
                                                      "invalidates" : 0,
                                                      "advanced" : 944,
                                                      "needTime" : 2,
                                                      "needFetch" : 0,
                                                      "isEOF" : 1,
                                                      "keyPattern" : "{ b: 1.0, e: 1.0, f: 1.0, _id: 1.0 }",
                                                      "isMultiKey" : 0,
                                                      "boundsVerbose" : "field #0['b']: [1.0, 3.0], field #1['e']: [true, true], field #2['f']: [1.0, 1.0], field #3['_id']: [MinKey, 105.0), (105.0, MaxKey]",
                                                      "yieldMovedCursor" : 0,
                                                      "dupsTested" : 0,
                                                      "dupsDropped" : 0,
                                                      "seenInvalidated" : 0,
                                                      "matchTested" : 0,
                                                      "keysExamined" : 945,
                                                      "children" : [ ]
                                              }
                                      ]
                              }
                      ]
              }
      }
      

      If $min is added (to indicate the first index entry that should be considered (by way of _id)), then the lack of an upper bound causes the rest of the index to be scanned, and the query becomes inefficient:

      > db.test.find({a: "foo", b: { $gte: 1, $lte: 3 }, c: 1, d: 2014, e: true, f: 1, _id: { $ne: 105 }}).min({b:1,e:true,f:1,_id:0}).explain()
      {
              "cursor" : "BtreeCursor b_1_e_1_f_1__id_1",
              "isMultiKey" : false,
              "n" : 2,
              "nscannedObjects" : 31500,
              "nscanned" : 1,
              "nscannedObjectsAllPlans" : 31500,
              "nscannedAllPlans" : 1,
              "scanAndOrder" : false,
              "indexOnly" : false,
              "nYields" : 246,
              "nChunkSkips" : 0,
              "millis" : 36,
              "indexBounds" : {
      
              },
              "server" : "genique:24802",
              "filterSet" : false,
              "stats" : {
                      "type" : "KEEP_MUTATIONS",
                      "works" : 31502,
                      "yields" : 246,
                      "unyields" : 246,
                      "invalidates" : 0,
                      "advanced" : 2,
                      "needTime" : 31499,
                      "needFetch" : 0,
                      "isEOF" : 1,
                      "children" : [
                              {
                                      "type" : "FETCH",
                                      "works" : 31502,
                                      "yields" : 246,
                                      "unyields" : 246,
                                      "invalidates" : 0,
                                      "advanced" : 2,
                                      "needTime" : 31499,
                                      "needFetch" : 0,
                                      "isEOF" : 1,
                                      "alreadyHasObj" : 0,
                                      "forcedFetches" : 0,
                                      "matchTested" : 2,
                                      "children" : [
                                              {
                                                      "type" : "IXSCAN",
                                                      "works" : 31501,
                                                      "yields" : 246,
                                                      "unyields" : 246,
                                                      "invalidates" : 0,
                                                      "advanced" : 31500,
                                                      "needTime" : 1,
                                                      "needFetch" : 0,
                                                      "isEOF" : 1,
                                                      "keyPattern" : "{ b: 1.0, e: 1.0, f: 1.0, _id: 1.0 }",
                                                      "isMultiKey" : 0,
                                                      "boundsVerbose" : "[{ : 1.0, : true, : 1.0, : 0.0 }, ]",
                                                      "yieldMovedCursor" : 0,
                                                      "dupsTested" : 0,
                                                      "dupsDropped" : 0,
                                                      "seenInvalidated" : 0,
                                                      "matchTested" : 0,
                                                      "keysExamined" : 1,
                                                      "children" : [ ]
                                              }
                                      ]
                              }
                      ]
              }
      }
      

      However, there is no need to scan the rest of the index - the computed index bounds say that the query can stop after b: 3 has been scanned.

      Adding a suitable $max value allows the query to be performant again:

      > db.test.find({a: "foo", b: { $gte: 1, $lte: 3 }, c: 1, d: 2014, e: true, f: 1, _id: { $ne: 105 }}).min({b:1,e:true,f:1,_id:0}).max({b:3,e:true,f:1,_id:100000}).explain()
      {
              "cursor" : "BtreeCursor b_1_e_1_f_1__id_1",
              "isMultiKey" : false,
              "n" : 2,
              "nscannedObjects" : 945,
              "nscanned" : 946,
              "nscannedObjectsAllPlans" : 945,
              "nscannedAllPlans" : 946,
              "scanAndOrder" : false,
              "indexOnly" : false,
              "nYields" : 7,
              "nChunkSkips" : 0,
              "millis" : 3,
              "indexBounds" : {
      
              },
              "server" : "genique:24802",
              "filterSet" : false,
              "stats" : {
                      "type" : "KEEP_MUTATIONS",
                      "works" : 947,
                      "yields" : 7,
                      "unyields" : 7,
                      "invalidates" : 0,
                      "advanced" : 2,
                      "needTime" : 944,
                      "needFetch" : 0,
                      "isEOF" : 1,
                      "children" : [
                              {
                                      "type" : "FETCH",
                                      "works" : 947,
                                      "yields" : 7,
                                      "unyields" : 7,
                                      "invalidates" : 0,
                                      "advanced" : 2,
                                      "needTime" : 944,
                                      "needFetch" : 0,
                                      "isEOF" : 1,
                                      "alreadyHasObj" : 0,
                                      "forcedFetches" : 0,
                                      "matchTested" : 2,
                                      "children" : [
                                              {
                                                      "type" : "IXSCAN",
                                                      "works" : 946,
                                                      "yields" : 7,
                                                      "unyields" : 7,
                                                      "invalidates" : 0,
                                                      "advanced" : 945,
                                                      "needTime" : 1,
                                                      "needFetch" : 0,
                                                      "isEOF" : 1,
                                                      "keyPattern" : "{ b: 1.0, e: 1.0, f: 1.0, _id: 1.0 }",
                                                      "isMultiKey" : 0,
                                                      "boundsVerbose" : "[{ : 1.0, : true, : 1.0, : 0.0 }, { : 3.0, : true, : 1.0, : 100000.0 })",
                                                      "yieldMovedCursor" : 0,
                                                      "dupsTested" : 0,
                                                      "dupsDropped" : 0,
                                                      "seenInvalidated" : 0,
                                                      "matchTested" : 0,
                                                      "keysExamined" : 946,
                                                      "children" : [ ]
                                              }
                                      ]
                              }
                      ]
              }
      }
      

      This $max value must be manually computed by the client application, which is burdensome and essentially duplicates the work of the server when it computes the index bounds of the query predicate.

      The only workaround at present is to compute and specify such a $max bound.

      A good first step might be: when $min is specified but not $max, set the upper bound as the combination of upper bounds of each of the fields from the computed index bounds (instead of just MaxKey for each of the fields), and vice-versa for $max but no $min. Further refinement might include using any gaps in the per-field bounds to skip sections of the index, if this can be determined to be possible/safe.

            Assignee:
            Unassigned Unassigned
            Reporter:
            kevin.pulo@mongodb.com Kevin Pulo
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated:
              Resolved: