[SERVER-15572] Explain reports same plan with and without hint, but without is 6x slower Created: 08/Oct/14  Updated: 10/Dec/14  Resolved: 09/Oct/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.6.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Doug Mayer Assignee: David Storch
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

The following explain plans appear to be exactly the same (save for "all plans" counts, which would make sense). However, the first does not use a hint, and the second does use a hint. All parameters reported by explain are the same, though. The one without a hint takes about 650ms, the one with a hint takes about 110ms.

main:SECONDARY> db.users.find({
...   account_id: ObjectId('513778770cf200dd51234567'),
...   deleted_at: { $exists: false },
...   _type: { $in: [ "Staff" ] }
... }).sort({ user_name: 1 }).limit(20).explain()
{
	"cursor" : "BtreeCursor account_id_1_user_name_1",
	"isMultiKey" : false,
	"n" : 20,
	"nscannedObjects" : 19863,
	"nscanned" : 19864,
	"nscannedObjectsAllPlans" : 149646,
	"nscannedAllPlans" : 158905,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 1241,
	"nChunkSkips" : 0,
	"millis" : 647,
	"indexBounds" : {
		"account_id" : [
			[
				ObjectId("513778770cf200dd51234567"),
				ObjectId("513778770cf200dd51234567")
			]
		],
		"user_name" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		]
	},
	"server" : "db02:27017",
	"filterSet" : false
}
 
main:SECONDARY> db.users.find({
...   account_id: ObjectId('513778770cf200dd51234567'),
...   deleted_at: { $exists: false },
...   _type: { $in: [ "Staff" ] }
... }).sort({ user_name: 1 }).limit(20).hint({account_id:1, user_name: 1}).explain()
{
	"cursor" : "BtreeCursor account_id_1_user_name_1",
	"isMultiKey" : false,
	"n" : 20,
	"nscannedObjects" : 19863,
	"nscanned" : 19864,
	"nscannedObjectsAllPlans" : 19863,
	"nscannedAllPlans" : 19864,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 155,
	"nChunkSkips" : 0,
	"millis" : 111,
	"indexBounds" : {
		"account_id" : [
			[
				ObjectId("513778770cf200dd51234567"),
				ObjectId("513778770cf200dd51234567")
			]
		],
		"user_name" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		]
	},
	"server" : "db02:27017",
	"filterSet" : false
}



 Comments   
Comment by David Storch [ 09/Oct/14 ]

Hi doug@ebackpack.com,

Thanks very much for the bug report! This is expected behavior, so I am going to close as Works as Designed. Read on for an explanation of why.

Adding .hint() to a query in most cases causes the query planner to construct just one query plan, using the hinted index. This bypasses query optimization, and allows execution of the hinted query plan to begin immediately.

Without .hint(), the query planner explores a potentially large space of possible query plans. (In the simplest case, this means one plan per index that could be used to answer the query.) There is overhead involved not only in exploring the plan space and generating the set of candidate plans, but also in query optimization---that is, the process of selecting the fastest plan. MongoDB's query optimizer works by running candidate plans for a trial period, and using observations about this partial query execution to infer which plan is best. The problematic query originally reported in this ticket has a lot of matching index keys (approximately 20,000) but only 20 matching documents. This low n-to-nscanned ratio means that the query optimizer has limited information about which plan is performing best and will extend the trial period in order to ensure that there is sufficient information for selecting a good plan. The consequence is that the amount of work performed by the system during the trial period is multiplied by the number of candidate plans. If there are n candidates, and each scans k index keys during the trial period, then nk keys get scanned in total. This is why it is possible for there to be a 6x slowdown for .explain() without .hint().

The fact that query optimization can be expensive is offset by the query system's use of a plan cache. When you run a query without .explain(), the query system will explore the plan space, run candidate plans for a trial period in order to find the optimal plan, and then complete execution of the query using the winning plan. The winning plan is then entered into the plan cache and re-used on subsequent executions of queries with the same shape. (Queries with .explain() neither read from nor write to the plan cache.) This means that the cost of query optimization is generally amortized over many executions, until the plan cache entry is evicted due to writes changing the data distribution.

I hope this explanation was helpful, and please feel free to reach out with any further questions or bug reports.

Best,
Dave

Generated at Thu Feb 08 03:38:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.