Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-31898

Improve sort analysis to allow multikey indexes to provide sorts when possible

    • Type: Icon: Improvement Improvement
    • Resolution: Fixed
    • Priority: Icon: Minor - P4 Minor - P4
    • 4.3.4
    • Affects Version/s: None
    • Component/s: Querying
    • Labels:
      None
    • Fully Compatible
    • Query 2019-12-16, Query 2019-12-30, Query 2020-01-13, Query 2020-01-27, Query 2020-02-10

      Due to the changes in SERVER-19402, it is no longer correct for a multikey index scan to provide a non-blocking sort. However, there are some special cases it which it is correct to do so. Specifically, it is correct when:

      1. The index bounds for all fields of the sort pattern are [MinKey, MaxKey].
      2. There are no bounds over a multikey field which shares a path prefix with the sort pattern.

      This ticket tracks the work to improve the query planner so that it can generate non-blocking sorts with multikey indexes when possible.

      To explain constraint #1, consider the case when you have index {myArray: 1} and you issue the following query:

      > db.c.find({myArray: {$gte: 5}}).sort({myArray: 1})
      { "_id" : ObjectId("5a04af3cb778d4774430038b"), "myArray" : [ 1, 6, 7 ] }
      { "_id" : ObjectId("5a04af35b778d4774430038a"), "myArray" : [ 3, 4, 5 ] }
      

      If we were to use an index scan with bounds [5, Infinity] to answer this query, then these documents would sort incorrectly---the second document would use 5 as its sort key and the first document would use 6.

      Constraint #2 is a bit more subtle. Suppose we have an index {"a.b": 1, "a.c": 1} and the query

      > db.c.find({"a.b": 0}).sort({"a.c": 1})
      { "_id" : 1, "a" : [ { "b" : 0, "c" : 1 }, { "b" : 1, "c" : -1 } ] }
      { "_id" : 0, "a" : [ { "b" : 1, "c" : 1 }, { "b" : 0, "c" : 0 } ] }
      

      The correct sort order is _id:1 and then _id:0, because c:-1 is smaller then c:0. However, if we were to use a multikey index scan with bounds on "a.b", we would end up using c:0 as a sort key for the document with _id:0 and c:1 as a sort key for the document with _id:1. This would sort the documents in the incorrect order. The problem is that the bounds on "a.b" constrain which keys along the path "a.c" we scan, even though the index bounds for "a.c" are [MinKey, MaxKey].

            Assignee:
            ian.boros@mongodb.com Ian Boros
            Reporter:
            david.storch@mongodb.com David Storch
            Votes:
            2 Vote for this issue
            Watchers:
            19 Start watching this issue

              Created:
              Updated:
              Resolved: