[SERVER-82677] Deduplicate index scan + fetch plans guaranteed to have similar performance Created: 01/Nov/23  Updated: 07/Feb/24

Status: In Code Review
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Matt Boros Assignee: Matt Boros
Resolution: Unresolved Votes: 0
Labels: query-director-triage
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-86251 Cleanup plan enumerator in preparatio... In Code Review
Related
related to SERVER-82549 MongoDB 7.0.2 SBE query slow when dir... Closed
is related to SERVER-62150 SBE Multiplanning can be slow when su... Open
Assigned Teams:
Query Optimization
Participants:

 Description   

In SERVER-82549 we have a query where we consider 14 plans with the same structure (index scan + fetch + filter). The index scans are technically different, but are guaranteed to have the same performance, and return the same RIDs:

Plan 1

filter
|
seek
|
index scan
X: [100, 100],
Y: [Minkey, Maxkey]

Plan 2

filter
|
seek
|
index scan
X: [100, 100],
Z: [Minkey, Maxkey]

Plan 3

filter
|
seek
|
index scan
X: [100, 100],
Y: [Minkey, Maxkey],
Z: [Minkey, Maxkey]

The non-open bounds in these index scans are identical. They will produce the same RIDs. For queries without a sort we should consider these duplicates and only trial one of them, preferably the one with the least number of indexed fields.

We've had a number of issues recently where multiplanning is slow. Removing identical plans is one way to help fix this, and generally improve multiplanner performance.



 Comments   
Comment by David Percy [ 29/Jan/24 ]

Another detail to consider is that if Y or Z is multikey, then these different plans may give you a different number of rows (duplicate RIDs) even though the set of RID values is the same.

Comment by Matt Boros [ 01/Nov/23 ]

Maybe a more accurate description is that we have common non-open prefix for these index bounds.

I wonder if we can take this a bit further. If I have indexes {a: 1, b: 1}, {b: 1, a: 1} and a query with predicates a=2 and b=2, is there a point in considering both of these indexes? Maybe we could we consider index scan - fetch - filter plans duplicates if the index scans satisfy the same set of prefixes. Never mind, as max.hirschhorn@mongodb.com pointed out to me one index scan might be more efficient depending on the selectivity of the a and b predicates. So this ticket is more like "deduplicating plans where the index bound non-minkey-maxkey prefix are the same". This is a confusing description but I think the example demonstrates my point well.

Generated at Thu Feb 08 06:49:57 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.