[SERVER-83509] idhack queries are incorrectly counted in serverStatus() as classic plan cache misses Created: 21/Nov/23  Updated: 07/Dec/23  Resolved: 07/Dec/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: David Storch
Assignee: Backlog - Query Optimization
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-75678 Consider adding a plan cache serverSt... Closed
Related
is related to SERVER-70025 add serverStatus metrics planCacheHit... Closed
Assigned Teams:
Query Optimization
Operating System: ALL
Participants:

Description

In SERVER-70025, we added serverStatus() metrics to count plan cache hits and misses for both the classic and SBE plan caches:

> db.serverStatus().metrics.query.planCache
{
	"classic" : {
		"hits" : NumberLong(0),
		"misses" : NumberLong(4)
	},
	"sbe" : {
		"hits" : NumberLong(0),
		"misses" : NumberLong(0)
	}
}

As of this writing, find-by-_id queries (a.k.a. "idhack" queries) never use either the classic or SBE plan cache. This is deliberate: idhack has its own special fast path, so skipping the plan cache is actually beneficial for performance.

The problem I observe is that even though idhack queries will never use the plan cache, they are counted as classic plan cache misses. Here's a simple example of the problem, reproduced against a standalone 7.0 server, demonstrating that the plan cache misses counter is incremented by 1 for every idhack query:

> db.serverStatus().metrics.query.planCache.classic.misses
NumberLong(5)
> db.c.find({_id: 2})
{ "_id" : 2 }
> db.serverStatus().metrics.query.planCache.classic.misses
NumberLong(6)
> db.c.find({_id: 2})
{ "_id" : 2 }
> db.serverStatus().metrics.query.planCache.classic.misses
NumberLong(7)
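
The manual check above can be scripted. The following is a minimal sketch, not server code: `toNum` and `classicMissesDelta` are hypothetical helper names introduced here for illustration, and the sketch assumes the counter is returned either as a plain number or as a NumberLong/Long object exposing `toNumber()`.

```javascript
// Hypothetical helper (not a server or shell API): converts a NumberLong/Long
// object, or a plain number, into a JavaScript number.
function toNum(v) {
  return v !== null && typeof v === "object" && typeof v.toNumber === "function"
    ? v.toNumber()
    : Number(v);
}

// Hypothetical helper: returns how much the classic plan cache "misses"
// counter grew while `op` ran. `db` is the shell's database handle.
function classicMissesDelta(db, op) {
  const read = () =>
    toNum(db.serverStatus().metrics.query.planCache.classic.misses);
  const before = read();
  op();
  return read() - before;
}

// Usage in the shell (collection `c` as in the transcript above; toArray()
// forces the cursor to actually execute the query):
//   classicMissesDelta(db, () => db.c.find({_id: 2}).toArray());
```

Given the behavior shown in the transcript, the delta for a single idhack query on an affected server would be expected to be 1, whereas a correct count would leave it at 0, provided no other queries run concurrently.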

Looking at the code, the problem is due to this line. The fix may be as simple as deleting it.

A high rate of plan cache misses (either SBE or classic) can indicate a performance problem, since in many workloads we expect the plan cache to be warm and to serve most queries. By contrast, idhack queries running without the classic plan cache is perfectly normal. Surfacing these idhack queries as plan cache misses could suggest a problem to server engineers or support engineers where there is none.
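
To illustrate how the miscount can mislead, here is a minimal sketch of the kind of miss-ratio calculation someone might run against these counters. `missRatio` is a hypothetical helper introduced for illustration, not an existing server or shell function; it takes one sub-document of `metrics.query.planCache`, such as `classic`.

```javascript
// Hypothetical helper: fraction of counted plan cache lookups that were
// misses. Returns null when nothing has been counted yet.
function missRatio(cache) {
  // Accept plain numbers as well as NumberLong/Long objects with toNumber().
  const toNum = (v) =>
    v !== null && typeof v === "object" && typeof v.toNumber === "function"
      ? v.toNumber()
      : Number(v);
  const hits = toNum(cache.hits);
  const misses = toNum(cache.misses);
  const total = hits + misses;
  return total === 0 ? null : misses / total;
}

// Usage in the shell:
//   missRatio(db.serverStatus().metrics.query.planCache.classic);
```

Because every idhack query currently adds to `misses` without any cache lookup taking place, a workload dominated by find-by-_id queries would push this ratio toward 1 even when the plan cache is behaving normally.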

