Details
-
Bug
-
Status: Backlog
-
Major - P3
-
Resolution: Unresolved
-
2.4.8, 2.6.0-rc0
-
None
-
Query Optimization
-
ALL
-
(copied to CRM)
Description
Currently we rank plans by work()'ing them some number of times and seeing who produces the most results.
If there is a plan with a blocking stage, it will therefore almost always be ranked lower than plans without a blocking stage. This is because plans without a blocking stage immediately start producing results. Although the plan with the blocking stage doesn't appear to be making progress immediately, it may end up looking much better / hitting EOF much faster if we were to run the plans to completion.
Blocking stages for which this is an issue:
- SORT
- AND_HASH
Here's one way to reproduce the issue, for the blocking SORT stage:
> t = db.t
|
test.t
|
> t.drop()
|
true
|
> t.ensureIndex({a: 1})
|
WriteResult({ "nInserted" : 1 })
|
> t.ensureIndex({b: 1})
|
WriteResult({ "nInserted" : 1 })
|
> for (var i = 0; i < 10000; i++) { t.save({a: 1}); }
|
WriteResult({ "nInserted" : 1 })
|
> for (var i = 0; i < 150; i++) { t.save({b: 1}); }
|
WriteResult({ "nInserted" : 1 })
|
> t.find({b: 1}).sort({a: 1}).explain()
|
{
|
"cursor" : "BtreeCursor a_1",
|
"isMultiKey" : false,
|
"n" : 150,
|
"nscannedObjects" : 10150,
|
"nscanned" : 10150,
|
"nscannedObjectsAllPlans" : 10347,
|
"nscannedAllPlans" : 10348,
|
"scanAndOrder" : false,
|
"indexOnly" : false,
|
"nYields" : 85,
|
"nChunkSkips" : 0,
|
"millis" : 139,
|
"indexBounds" : {
|
"a" : [
|
[
|
{
|
"$minElement" : 1
|
},
|
{
|
"$maxElement" : 1
|
}
|
]
|
]
|
},
|
"server" : "localhost:27017",
|
"filterSet" : false
|
}
|
The query ends up using index {a: 1}, which involves scanning 10,000 docs and fetching them from disk if they're not in memory. We would have liked to use index {b: 1}, which would involve fetching only 150 docs and then sorting them in memory.
Attachments
Issue Links
- is duplicated by
-
SERVER-13866 Wrong index is being picked
-
- Closed
-
-
SERVER-42733 Query not using index intersection
-
- Closed
-
- is related to
-
SERVER-13831 Find with $gte/$gt/$lte/$lt on Date field and sorting on another field leads to unnecessary in-memory sort without using index
-
- Closed
-
-
SERVER-20619 Statistics-based query optimization
-
- Backlog
-