[SERVER-15636] Query Optimizer Index Scans when first predicate is unbounded. Created: 13/Oct/14  Updated: 19/Oct/21  Resolved: 19/Oct/21

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.6.0
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Osmar Olivo Assignee: James Wahlin
Resolution: Duplicate Votes: 1
Labels: 2426
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-13197 Tighten index bounds and allow compou... Backlog
Participants:
Case:

 Description   

Imagine you have fields A & B and a compound index defined as

 { A :1, B:1} 

Now imagine a query

 find( { B : value }).sort(A) 

In 2.4 it appears we would jump to all values of A and within each subset of A skip to entries where B="value" in the index. Leading to a smaller nscannedObjects() value in the query plan.

As of 2.6 it appears indexed plans where the first field in the index is unconstrained are not considered. Which means we end up not using plans in 2.6 that would sometimes get used in 2.4

In this particular case, we end up essentially doing a full index scan. Which depending on data distribution, will most likely be slower than in 2.4.



 Comments   
Comment by James Wahlin [ 19/Oct/21 ]

Closing as a duplicate of SERVER-13197.

It is worth noting for anyone following that we generally recommend ordering index keys in the following priority:
1) Fields used for equality matches
2) Fields used to sort on
3) Fields used to perform range scan

For more on this see the following blog post: https://www.alexbevi.com/blog/2020/05/16/optimizing-mongodb-compound-indexes-the-equality-sort-range-esr-rule/

Comment by David Storch [ 11/Oct/21 ]

I'll mark this for triage. We can consider as a team whether we want to resolve this ticket as a dupe of SERVER-13197.

Comment by Christopher Price [ 16/Oct/14 ]

Use case:
I have a collection with 2.5 million documents. The collection contains a directory of people. Each person/document has a firstName, lastName, zipcode and some other ancillary fields.
I have a customer that wants to find the zipcodes of all people with the firstName of "Adam" or "Eve" and sort the results by lastName.
What is the optimal index(es) for the following queries?

db.directory.find({"firstName":{$in:["Adam","Eve"]}},{"lastName":1,"firstname":1,"zipcode":1}).sort("lastName":1);
...
db.directory.find({"firstName":{$in:["Adam","Eve"]}}).count();
45678

In an odd twist, there are some people that don't have a lastName.

Comment by Christopher Price [ 13/Oct/14 ]

Actually, in the examples I provided that led to this ticket, the index on

{A:1, B:1}

was correctly selected.
The nscanned was the same as the count of the documents in the collection (possibly to be expected).
Additionally, nscannedObjects was also the same as the count of the documents in the collection which was not expected considering that MongoDB should have been able to retrieve and validate B from the index (that part of the query should be covered by the index).

Generated at Thu Feb 08 03:38:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.