-
Type:
Improvement
-
Resolution: Duplicate
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
Query Optimization
-
None
-
None
-
None
-
None
-
None
-
None
-
None
When a plan contains $or, we exhaust a single branch before we move on to the next.
When used in multi-planning, it leads to the whole plan being estimated on the selectivity of the first branch.
The real-life-like example of this is the following:
Consider a collection that has some time segments in format {start: X, end: Y}.
If the event is not finished yet, the end field will be null or missing.
If we want to query all events that are happening at time T, than the query may look like this:
db.test.find({start: {$lte: T}, $or: [{end: null}, {end: {$gte: T}}]})
Given the index {start: 1, end: 1} the current optimizer will generate 2 plans:
IXSCAN {
start: [ '[-inf.0, 500]' ],
end: [ '[MinKey, MaxKey]' ]
}
This is not very efficient plans, as it forces a full scan of "end" part of the index.
AndÂ
OR
FETCH {end: {$eq: null}}
IXSCAN {
start: [ '[-inf.0, T]' ],
end: [ '[undefined, undefined]', '[null, null]' ]
}
IXSCAN {
start: [ '[-inf.0, 500]' ],
end: [ '[500, inf.0]' ]
}
This plan is much more efficient, because it skips a lot of "end" values, but the first branch of this OR plan is not, because it is forced to seek only segments with no ends and only then it will go on to search for events with an end.
Here is a script that will show that multi-planner will prefer first plan to the second: repro.js![]()
- duplicates
-
SERVER-46904 Efficiency of one $or branch during planning may not be representative of overall plan efficiency
-
- Backlog
-
- is related to
-
SERVER-46904 Efficiency of one $or branch during planning may not be representative of overall plan efficiency
-
- Backlog
-