Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-41142

Compound index on an array field is not being used correctly

    XMLWordPrintable

    Details

      Description

      In the following schema: 

      {
          "arr" : [ 
              "a", 
              "b", 
              "c"
          ],
          "field" : 1,
          "field_2" : 1
      }
      

      with the following index: 

      {
          "arr" : 1,
          "field" : 1,
          "field_2" : 1
      }
      

      If I use this query:

      db.coll
          .find({
              $or: [
                  {arr: 'a'},
                  {arr: []}
              ]
          })
          .sort({field: 1})
          .explain()
      

      The inputStages are separated to 3 stages - arr: [a, a], arr: [[], []], arr: [undefined, undefined].
      This is a good scenario since these input stages are followed by a "SORT_MERGE" stage

      BUT, if I use the following query:

      db.coll
          .find({
              $or: [
                  {arr: 'a'},
                  {arr: []}
              ],
              field: 1
          })
          .sort({field_2: 1})
          .explain()
      

      There are only 2 input stages - arr: [a, a], arr: [[undefined, undefined], [[] , []]] .
      This results that an additional stage needs to happen in order to FETCH the empty array docs and then it cannot use the SORT_MERGE stage.

      In large collection, this causes the following error:
      "Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit
      Even though there is an index for this query.

      I would expect the second scenario to perform like the first one - separating the input stages to 3 stages and to use the SORT_MERGE function.

        Attachments

        1. bad_scenario.js
          5 kB
        2. good_scenario.js
          6 kB

          Issue Links

            Activity

              People

              Assignee:
              backlog-query-optimization Backlog - Query Optimization
              Reporter:
              tomgrossman Tom Grossman
              Participants:
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

                Dates

                Created:
                Updated: