Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-13732

Predicates in top-level implicit AND query not considered when generating index access plan for contained OR

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 3.5.4
    • Component/s: Querying
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Sprint:
      Quint Iteration 3, Quint Iteration 4, Quint Iteration 5, Quint Iteration 6, Query 2017-01-23, Query 2017-02-13, Query 2017-03-06

      Description

      Issue Status as of Feb 28, 2017

      ISSUE SUMMARY

      A "contained $or" is a query predicate which has an explicit or implicit $and at the top level where one of the clauses of the $and is a $or. For instance, the following two predicates are contained $or queries:

      {$and: [{a: 1}, {$or: [{b: 1}, {b: 2}]}, {c: 3}]}
      {a: 1, $or: [{b: 1}, {b: 2}], c: 3}
      

      This issue affects contained $or queries. If the query is answered by indexing the contained $or, versions 2.4.x and earlier of the server could also use the predicates outside the $or to constrain the index bounds. Suppose that a collection has two compound indexes {b: 1, a: 1} and {c: 1, a: 1}. Now consider a query with predicate

      {a: {$gte: 3, $lte: 6}, $or: [{b: 1}, {c: 2}]}
      

      In versions 2.4.x and earlier, this query is answered by two index scans:

      1. First scanning index {b: 1, a: 1} for keys where b is equal to 1 and a is on the interval [3, 6], and then
      2. scanning index {c: 1, a: 1} for keys where c is equal to 2 and a is on the interval [3, 6].

      However, on versions of the server affected by this issue, both index scans will have unconstrained bounds on field a and may examine keys where the value of a is outside the interval [3, 6].

      Versions 2.4.x and earlier of the server could also use the predicates outside the $or to provide index bounds for the first field in the index. Consider the same query on a collection with two compound indexes {a: 1, b: 1} and {a: 1, c: 1}. In versions 2.4.x and earlier, this query could be answered by two index scans:

      1. First scanning index {a: 1, b: 1} for keys where a is on the interval [3, 6] and b is equal to 1, and then
      2. scanning index {a: 1, c: 1} for keys where a is on the interval [3, 6] and c is equal to 2.

      On versions of the server affected by this issue, this query will always be answered by a single index scan on either {a: 1, b: 1} or {a: 1, c: 1} for keys where a is on the interval [3, 6].

      USER IMPACT

      Contained $or queries indexed in the manner described above will have looser bounds than they did on previous versions. Depending on the data in the collection, this could mean that the number of index keys examined in order to answer the query will be much higher, causing bad query performance.

      WORKAROUNDS

      Applications can work around the issue by rewriting contained $or queries as rooted $or queries. A "rooted $or" is a query predicate consisting of a single $or statement. For example, the following query predicate is a rooted $or:

      {$or: [{a: 1, b: 1}, {a: 2, b: 2}]}
      

      This rewrite involves distributing any predicates outside the contained $or into each $or clause. For example:

      {a: 1, b: 1, $or: [{c: 1}, {d: 1}]}
      

      could become the following equivalent query:

      {$or: [{a: 1, b:1, c: 1}, {a: 1, b: 1, d: 1}]}
      

      AFFECTED VERSIONS

      MongoDB 2.6.0 through 3.5.3

      FIX VERSION
      The fix is included in the 3.5.4 development release.

      Original description

      Certain types of queries containing $or clauses no longer make efficient use of indexes. This is a regression from 2.4 => 2.6.

      Given a collection with indexes {a:1, b:1} and {a:1, c:1}, consider the following query:

      db.collection.find({a:1, $or: [{b:1}, {c:1}]})
      

      The 2.4 query planner considers a union of an scan on index {a:1, b:1} with a scan on index {a:1, c:1}.

      The 2.6 query planner does not consider this plan; it selects a plan that scans the whole {a:1} range for one of the two indexes.

      There is an available workaround for this issue, which is to rewrite the query as a rooted $or (e.g. {$or: [{a:1, b:1}, {a:1, c:1}]}). For these queries, the 2.6 query planner correctly considers a plan that performs a union of the separate index scans.

        Attachments

          Issue Links

            Activity

              People

              • Votes:
                18 Vote for this issue
                Watchers:
                49 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: