Loading...

XML

Word

Printable

JSON

Type: Bug
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 2.4.8, 2.6.0-rc0
Component/s: Querying
Labels:

Assigned Teams:

Query Optimization
Operating System:
ALL
Case:
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Currently we rank plans by work()'ing them some number of times and seeing who produces the most results.

If there is a plan with a blocking stage, it will therefore almost always be ranked lower than plans without a blocking stage. This is because plans without a blocking stage immediately start producing results. Although the plan with the blocking stage doesn't appear to be making progress immediately, it may end up looking much better / hitting EOF much faster if we were to run the plans to completion.

Blocking stages for which this is an issue:

SORT
AND_HASH

Here's one way to reproduce the issue, for the blocking SORT stage:

> t = db.t
test.t
> t.drop()
true
> t.ensureIndex({a: 1})
WriteResult({ "nInserted" : 1 })
> t.ensureIndex({b: 1})
WriteResult({ "nInserted" : 1 })
> for (var i = 0; i < 10000; i++) { t.save({a: 1}); }
WriteResult({ "nInserted" : 1 })
> for (var i = 0; i < 150; i++) { t.save({b: 1}); }
WriteResult({ "nInserted" : 1 })
> t.find({b: 1}).sort({a: 1}).explain()
{
	"cursor" : "BtreeCursor a_1",
	"isMultiKey" : false,
	"n" : 150,
	"nscannedObjects" : 10150,
	"nscanned" : 10150,
	"nscannedObjectsAllPlans" : 10347,
	"nscannedAllPlans" : 10348,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 85,
	"nChunkSkips" : 0,
	"millis" : 139,
	"indexBounds" : {
		"a" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		]
	},
	"server" : "localhost:27017",
	"filterSet" : false
}

The query ends up using index {a: 1}, which involves scanning 10,000 docs and fetching them from disk if they're not in memory. We would have liked to use index {b: 1}, which would involve fetching only 150 docs and then sorting them in memory.

is duplicated by

SERVER-13866 Wrong index is being picked

Closed

SERVER-42733 Query not using index intersection

Closed

is related to

SERVER-13831 Find with $gte/$gt/$lte/$lt on Date field and sorting on another field leads to unnecessary in-memory sort without using index

Closed

SERVER-20619 Statistics-based query optimization

Backlog

Assignee:: [DO NOT USE] Backlog - Query Optimization
Reporter:: David Storch
Participants:: [DO NOT USE] Backlog - Query Optimization, Adrien Jarthon, David Storch
Votes:: 12 Vote for this issue
Watchers:: 41 Start watching this issue

Created:: Feb 26 2014 09:57:01 PM UTC
Updated:: Mar 26 2025 03:17:27 PM UTC

Details

Description

Attachments

Issue Links

Forms

Activity

People

Dates