Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-13754

Bounds should not be combined for $or queries that can use merge sort

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • 2.6.2, 2.7.1
    • Affects Version/s: 2.6.0
    • Component/s: Querying
    • None
    • ALL

      Issue Status as of May 15, 2014

      ISSUE SUMMARY
      Sometimes, sorted $or queries that require multiple index scans over the same index incorrectly perform an in-memory sort, instead of a merge sort. An in-memory sort has higher memory requirements than merge sort, and may require scanning more documents or index keys than a merge sort. This means that the query planner should prefer merge sort solutions to in-memory sort solutions where possible.

      USER IMPACT
      Users may experience low performance on these type of queries.

      WORKAROUNDS
      There are no known workarounds. Users experiencing severe performance degradation can consider downgrading to the 2.4.10 production release until a fix is released.

      AFFECTED VERSIONS
      Production versions 2.6.0 and 2.6.1 are affected by this issue.

      FIX VERSION
      The fix is included in the 2.6.2 production release.

      RESOLUTION DETAILS
      The QueryPlannerAnalysis phase joins the results of multiple index scans for affected $or queries with a mergeSort when the sort order is shared.

      Original description.

      It seems that at current version (2.6.1), the query optimizer is still unable to optimize queries that use $or, $in, limit() and sort() all at once. The https://jira.mongodb.org/browse/SERVER-1205 and https://jira.mongodb.org/browse/SERVER-3310 fixes, each only improved performance on queries having 3 out of the 4 operations listed above. When introducing a 4th operation into the query, the optimization goes out the window. This behavior is observed with full index and document scans within the $or clause, even though a limit() is specified.

      Suppose I have an application that needs to find the 10 latest newsfeeds that are either:
      a) Posted by me,
      b) Posted by my friends that have permission types of 'public', or 'friend',
      c) Designated to a specified group of users which included me.

      Suppose the indexes on this collection are:

      {owner:1, create_time: -1, permission: 1}

      Compound Index

      {designate:1, create_time: -1}

      Compound Multikey Index where 'designate' is an array.

      The following query is not optimized, and scans way more documents then it should:

      db.stream.find(
          {$or: [
              {owner: my_id },
              {owner: {$in: [lots_of_friend_ids]}, permission: {$in: ['public', 'friend']}},
              {designate: my_id}
          ]}
      ).limit(10).sort({create_time: -1})
      

            Assignee:
            david.storch@mongodb.com David Storch
            Reporter:
            xres74 Edward
            Votes:
            0 Vote for this issue
            Watchers:
            10 Start watching this issue

              Created:
              Updated:
              Resolved: