[SERVER-31898] Improve sort analysis to allow multikey indexes to provide sorts when possible Created: 09/Nov/17  Updated: 30/Oct/23  Resolved: 28/Jan/20

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 4.3.4

Type: Improvement Priority: Minor - P4
Reporter: David Storch Assignee: Ian Boros
Resolution: Fixed Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-40599 Consider scanning multikey indices fo... Closed
Related
is related to SERVER-33387 Multikey index does not provide nonbl... Closed
is related to SERVER-45823 Allow multikey index scans to be expl... Backlog
Backwards Compatibility: Fully Compatible
Sprint: Query 2019-12-16, Query 2019-12-30, Query 2020-01-13, Query 2020-01-27, Query 2020-02-10
Participants:

 Description   

Due to the changes in SERVER-19402, it is no longer correct for a multikey index scan to provide a non-blocking sort. However, there are some special cases it which it is correct to do so. Specifically, it is correct when:

  1. The index bounds for all fields of the sort pattern are [MinKey, MaxKey].
  2. There are no bounds over a multikey field which shares a path prefix with the sort pattern.

This ticket tracks the work to improve the query planner so that it can generate non-blocking sorts with multikey indexes when possible.

To explain constraint #1, consider the case when you have index {myArray: 1} and you issue the following query:

> db.c.find({myArray: {$gte: 5}}).sort({myArray: 1})
{ "_id" : ObjectId("5a04af3cb778d4774430038b"), "myArray" : [ 1, 6, 7 ] }
{ "_id" : ObjectId("5a04af35b778d4774430038a"), "myArray" : [ 3, 4, 5 ] }

If we were to use an index scan with bounds [5, Infinity] to answer this query, then these documents would sort incorrectly---the second document would use 5 as its sort key and the first document would use 6.

Constraint #2 is a bit more subtle. Suppose we have an index {"a.b": 1, "a.c": 1} and the query

> db.c.find({"a.b": 0}).sort({"a.c": 1})
{ "_id" : 1, "a" : [ { "b" : 0, "c" : 1 }, { "b" : 1, "c" : -1 } ] }
{ "_id" : 0, "a" : [ { "b" : 1, "c" : 1 }, { "b" : 0, "c" : 0 } ] }

The correct sort order is _id:1 and then _id:0, because c:-1 is smaller then c:0. However, if we were to use a multikey index scan with bounds on "a.b", we would end up using c:0 as a sort key for the document with _id:0 and c:1 as a sort key for the document with _id:1. This would sort the documents in the incorrect order. The problem is that the bounds on "a.b" constrain which keys along the path "a.c" we scan, even though the index bounds for "a.c" are [MinKey, MaxKey].



 Comments   
Comment by Githook User [ 28/Jan/20 ]

Author:

{'email': 'ian.boros@mongodb.com', 'username': 'puppyofkosh', 'name': 'Ian Boros'}

Message: SERVER-31898 Allow multikey indexes to provide sorts in certain cases
Branch: master
https://github.com/mongodb/mongo/commit/ccc13ef0b7ce0f908479522660a54dcbd142f7a1

Generated at Thu Feb 08 04:28:32 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.