Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-63132

Allow indexed field to provide sort key when index has non-simple collation

    • Type: Icon: Improvement Improvement
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 7.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • Query Optimization
    • Fully Compatible
    • QO 2023-05-15, QO 2023-05-29, QO 2023-06-12, QO 2023-06-26, QO 2023-07-10

      In the following example, a query is run against a collection with a non-simple collation. The index created and find command both inherit the collection collation.

      While 'z' is indexed, as there is no filter or sort on 'y', it cannot be used to provide indexed sort order. The indexed value for 'z' should however be usable as a sort key in a blocking sort. This would allow the find below to examine 10 index keys, sort on 'z', and then fetch 5 documents for first 5 sorted keys. Instead we construct a query plan where we fetch the document for each index key examined, fetching 10 documents instead of 5.

      db.test.drop();
      db.createCollection("test", {collation: {locale: "en", strength: 2}});
      db.test.createIndex({x: 1, y: 1, z: 1});
      for (var i = 0; i < 10; ++i) {for (var i = 0; i < 10; ++i) {db.test.insert({x: 1, y: 1, z: i});}}
      var explain = db.test.explain("executionStats").find({x: 1}).sort({z: 1}).limit(5).next();
      // Ideally we would fetch only 5 documents. This assert fails as we instead fetch 10.
      assert.eq(5, explain.executionStats.totalDocsExamined, explain);
      

      The starting point for this behaviorĀ is here in our code. We call a method on the index scan node that asks whether a given field isĀ fully provided for. 'Fully provided' for an index with a collation means the index bounds generated by the index scan node cannot contain string values and we can therefore materialize the original value from the index key. For this query, we are scanning the sort field 'z' from MinKey to MaxKey, so it can contain strings. Given that, we add a fetch stage below the sort. I believe this logic is too restrictive for what we need in the sort stage. The collated value of 'z' should be sufficient for sorting purposes.

        1. newplan.txt
          8 kB
        2. oldplan.txt
          7 kB
        3. plan_different_collations.txt
          6 kB

            Assignee:
            peter.volk@mongodb.com Peter Volk
            Reporter:
            james.wahlin@mongodb.com James Wahlin
            Votes:
            0 Vote for this issue
            Watchers:
            12 Start watching this issue

              Created:
              Updated:
              Resolved: