Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-49274

Coalesce scans of the same index in plans for rooted ORs

    • Type: Icon: Improvement Improvement
    • Resolution: Unresolved
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Querying
    • Query Optimization

      A rooted OR query with n branches gets turned into a plan with n index scans, even if each scan is over the same index. For example, consider a collection with an index on a:1,b:1. This query will result in a plan with two index scans:

      db.c.find({$or: [{a:1,b:1}, {a: 1, b: 2}]}) 
      "stage" : "SUBPLAN",
      "inputStage" : {
          "stage" : "FETCH",
          "inputStage" : {
              "stage" : "OR",
              "inputStages" : [
                  {
                      "stage" : "IXSCAN",
                      "keyPattern" : {
                          "a" : 1,
                          "b" : 1
                      },
                      "indexName" : "a_1_b_1",
                      "isMultiKey" : false,
                      "multiKeyPaths" : {
                          "a" : [ ],
                          "b" : [ ]
                      },
                      "isUnique" : false,
                      "isSparse" : false,
                      "isPartial" : false,
                      "indexVersion" : 2,
                      "direction" : "forward",
                      "indexBounds" : {
                          "a" : [
                              "[1.0, 1.0]"
                          ],
                          "b" : [
                              "[1.0, 1.0]"
                          ]
                      }
                  },
                  {
                      "stage" : "IXSCAN",
                      "keyPattern" : {
                          "a" : 1,
                          "b" : 1
                      },
                      "indexName" : "a_1_b_1",
                      "isMultiKey" : false,
                      "multiKeyPaths" : {
                          "a" : [ ],
                          "b" : [ ]
                      },
                      "isUnique" : false,
                      "isSparse" : false,
                      "isPartial" : false,
                      "indexVersion" : 2,
                      "direction" : "forward",
                      "indexBounds" : {
                          "a" : [
                              "[1.0, 1.0]"
                          ],
                          "b" : [
                              "[2.0, 2.0]"
                          ]
                      }
                  }
              ]
          }
      }

      The same logic applies even if there are more than two branches. We know of one instance where a customer was running queries of this form with 20 branches (and was getting 20 IX scans).

      It seems like it would be a strict improvement to use a plan which does a single index scan, instead of one per branch. Today, it's possible to "force" this behavior in some cases by using $in. For example:

      db.c.find({a: 1, b: {$in: [1,2]}}) 

      This query would result in a plan which does a scan over the a:1,b:1 index and uses bounds a:[1,1], b:[1,2].

       

      Using $in is not always possible, however. For example, the below query could not be re-written to use $in because doing so would change its meaning:

      db.c.find({$or: [{a:1,b:1}, {a: 2, b: 2}]}) 

      Using $in:

      db.c.find({a: {$in: [1,2]}, b: {$in: [1,2]}}) 

      The second query would return documents where a is 1 but b is 2, whereas the first query would not.

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            ian.boros@mongodb.com Ian Boros
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated: