[SERVER-78124] add query count in getPlanCache list, this can help us to get the hot query sql Created: 15/Jun/23  Updated: 15/Oct/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: y yz Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: query-product-scope-2, query-product-urgency-3, query-product-value-3, scope
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File image-2023-06-15-22-46-52-763.png     PNG File image-2023-06-15-22-48-44-889.png    
Assigned Teams:
Query Execution
Operating System: ALL
Participants:

 Description   

now,we can not get the hot query sql.

after add query count, wo can get the hot qeury, this can help us to analysis the hot sql priority .

 

In addition, after add the statistics, we can centralized optimize hot sql index problem. 

 

 

 

 

 



 Comments   
Comment by y yz [ 15/Oct/23 ]

hi, Colby Ing

got it, Thank you very much for your reply.

Comment by Colby Ing [ 13/Oct/23 ]

Hi 1147952115@qq.com - we have a feature coming soon that satisfies this requirement - you can find the documentation here. This should make its way into our self-hosted binaries in the 8.0 release (Mid 2024), does that timeframe work for you? We put this ticket on our backlog, but our team currently does not have the bandwidth to fully test and analyze the code in your PR. Even though it's a small PR, we have to reason about the concurrency model in our plan cache, and determine if it's compatible, which adds a lot of complexity. We will consider this in our future planning.

 

Thanks,

Colby

Comment by y yz [ 13/Oct/23 ]

hi, Colby Ing

 

Do we have the conclusion of this PR(requirement 2.)? Thank you.

 

thanks.

Comment by y yz [ 07/Aug/23 ]

hi, Colby Ing

Yes, these are what we really want.

thanks.

Comment by Colby Ing [ 04/Aug/23 ]

Hi 1147952115@qq.com, thanks - I think I have a better understanding of what you want to do. Ideally - you would like to:

  • Analyze slow queries and look at execution details to see why that query is slow, perhaps you used a wrong index, or no index at all.
  • For those queries that used the wrong index, you would like to force a certain index to that query.
  • Additionally, you would like to get a sense of how big the problem is, by looking at how many queries used that sub-optimal plan. For all queries, you would like to view queries that have high counts.

Please let me know if I've captured the requirements correctly.

We do have a roadmap that addresses many of these use-cases - mainly to be able to analyze query plans and how many times those query plans have been run. Happy to set up a call to talk about you use-case!

 

Thank you,

Colby

Comment by y yz [ 31/Jul/23 ]

hi, Colby Ing

Thank you for your reply. I am sorry that I did not describe this PR clearly.
There are actually two requirements here:

Requirement 1: get hot sql. (not for this PR)
Same kind of query sql count statistics, a kind of query sql count =query1 that  use collScan + query2 that not cached in PlanCache list(only one candidate index) + query3 that cached in PlanCache list(At least two candidate index).

 

 

Requirement 2:  add count statistics in plancache list, with other stastics(for example, works、createdFromQuery、creationExecStats) to help us to analysis candidate index problem and its influence. (for this PR)
1. in online MongoDB cluster, sometimes the sql will run out, the sql selected the wrong index(not best index), so we periodically get the PlanCache list of the corresponding collection every few seconds to analyze the reason. In addition, we shold stastic the query affected during use the wrong index,so we want to know how many queries are affected when use the wrong index.

2. According to the online mongodb statistics, most of the SQL that needs index optimization online is the SQL that has two or more candidate indexes. sometimes, there are many different kind of query sql, so we  need to reference plancache list to carry out high-priority index optimization for this part of SQL.

 

this PR is mainly for the second scenariofor, requirement 2.

Comment by Colby Ing [ 28/Jul/23 ]

Hi 1147952115@qq.com

 

Thank you so much for taking the time to look into this and coming up with a solution. We currently have a feature in development that addresses this use-case. This feature will give you the ability to look at frequently run queries, so you can understand which queries need optimization. I'm happy to schedule a call with you and your team to discuss any details!

Comment by Chris Kelly [ 07/Jul/23 ]

Once again, thanks for the PR! Your contributions are appreciated. I'll pass this to the relevant team to look at the suggested plan cache changes.

Comment by y yz [ 15/Jun/23 ]

the optimize code address:

https://github.com/mongodb/mongo/pull/1556

Comment by y yz [ 15/Jun/23 ]

We have such a problem, When there are a lot of sql  queries,  some query have  more than one candidate index,and have many slow log, maybe index problem, we don't know which index shold be priority optimize.

 

add the counters, we can get hot sql that have multiple candidate index and have inde problem, this may be useful.

Comment by y yz [ 15/Jun/23 ]

befor getPlanCache().list(), as following:

 

 

after add statistics, as following:

 

Generated at Thu Feb 08 06:37:31 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.