[SERVER-42789] Add query execution statistics to slow query log line to help identify efficiency of data access plan Created: 13/Aug/19  Updated: 02/Jan/24

Status: Open
Project: Core Server
Component/s: Aggregation Framework, Diagnostics
Affects Version/s: None
Fix Version/s: 5.0 Desired

Type: Improvement Priority: Major - P3
Reporter: Eric Sommer Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 6
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Initiative
Related
Assigned Teams:
Query Execution
Sprint: Query 2019-12-16, Query 2019-12-30, Query 2020-01-13, Query 2020-02-10, Query 2020-02-24, Query Execution 2021-03-08, Query Execution 2021-03-22, Query Execution 2021-04-05, Query Execution 2021-04-19, Query Execution 2021-05-03, Query Execution 2021-07-12
Participants:
Case:

 Description   

Currently, the value of nReturned reports the number of documents returned by an aggregation after all of its transformations have been applied. That number is not useful for determining the scanned-to-returned ratio of the $cursor stage. It would be more useful to also report the number of documents retrieved by $cursor.
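To make the problem concrete, here is a minimal sketch with hypothetical numbers (none of these values come from a real log line): a $group stage collapses many input documents into a few results, so a ratio computed against nReturned makes a perfectly reasonable access plan look terrible.

```python
# Illustrative sketch (hypothetical numbers): why nReturned alone is
# misleading for judging the data access plan of an aggregation.
docs_examined = 1000    # documents scanned by the $cursor stage
cursor_returned = 800   # documents the $cursor stage actually passed on
n_returned = 4          # groups emitted after a $group stage

# Ratio computable from the slow query log today: looks like a terrible plan.
naive_ratio = docs_examined / n_returned          # 250.0

# Ratio this ticket asks to make computable: the plan is actually fine.
access_ratio = docs_examined / cursor_returned    # 1.25

print(naive_ratio, access_ratio)
```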



 Comments   
Comment by Charlie Swanson [ 02/Jan/24 ]

john.morales@mongodb.com brought up this ticket at a recent 6WR for PM-3268. I wanted to highlight for watchers that:

  1. We will consider this for INIT-377, to discuss at next initiative review
  2. This ticket tracks a very particular ask about $cursor and agg with $match/$group. There are related issues lurking here like how $unwind can similarly change nReturned. It may be better to triage this as an epic (I see PM-2555 is already filed which is great) so that we can have a design review to ensure what we are building will handle the right cases.

cc judah.schvimer@mongodb.com and james.wahlin@mongodb.com 

Comment by David Storch [ 25/Feb/21 ]

A few notes on the POC:

  1. We need to explore how similar behavior can be implemented with the slot-based execution engine (SBE). The approach needs to work for both engines, but the POC predates SBE and is pretty tightly tied to the classic execution engine.
  2. There was a bug (good find julia.oppenheim!) with how nFiltered is reported for getMore operations. In particular, in the POC each getMore reports the cumulative nFiltered value. Instead, it should report nFiltered in a non-cumulative fashion, calculated by subtracting the quantity saved at the beginning of the getMore operation from the final nFiltered value.
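The snapshot-and-subtract fix described in point 2 can be sketched as follows (hypothetical classes, not the server's actual code): take a snapshot of the cumulative counter when the getMore begins, and report the difference at the end.

```python
# Sketch of the per-getMore delta: nFiltered is kept cumulatively on the
# cursor, but each getMore reports only the work it did itself.

class CursorStats:
    def __init__(self):
        self.n_filtered = 0  # cumulative across the cursor's lifetime

def run_get_more(stats, docs_filtered_this_batch):
    saved = stats.n_filtered                       # snapshot at getMore start
    stats.n_filtered += docs_filtered_this_batch   # execution happens here
    return stats.n_filtered - saved                # non-cumulative value to log

stats = CursorStats()
first = run_get_more(stats, 7)   # reports 7
second = run_get_more(stats, 3)  # reports 3, not the cumulative 10
```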
Comment by Alejandro Torrealba Villasana (Inactive) [ 02/Dec/20 ]

The next Auto-pilot init sync is scheduled now for Dec 14.

Comment by Alejandro Torrealba Villasana (Inactive) [ 02/Dec/20 ]

rezwan.khan - just synced with julia.oppenheim. I am scheduling the next Auto-pilot init sync, and we can bring the POC results there so everybody involved can see them.

CC: craig.homa, ian.whalen

Comment by Rez Khan (Inactive) [ 23/Nov/20 ]

julia.oppenheim is testing this on Wednesday, Nov 25th, aiming to have results to review post-Thanksgiving.

alejandro.torrealba - can you set up some time with John, me, and Julia to review the results?

Comment by Alejandro Torrealba Villasana (Inactive) [ 09/Oct/20 ]

ian.whalen , comments were made in the Init Plan to indicate the next steps to test the POC.

Comment by Ian Whalen (Inactive) [ 24/Sep/20 ]

Since this is tied into the auto-pilot INIT: ping rezwan.khan for more info on how that testing went.

Comment by Jonathan Balsano [ 17/Jul/20 ]

Hey craig.homa, I believe rezwan.khan did some testing on that patch, I'll let him update here.

Comment by Justin Seyster [ 10/Feb/20 ]

In addition to naming, there are a few new questions.

First, should we reconsider how we categorize documents that get filtered out by a FETCH_STAGE? With the current implementation plan, a document will get marked as not preserved if it is filtered by a FETCH_STAGE. By that standard, the docsScanned:docsPreserved ratio will look the same for a completely covered plan as it does for a plan that has to fetch a lot of documents that ultimately get filtered. I don't think that's a huge problem, though, because we still have the nFetched metric to distinguish cases like that.

Second, will the planned implementation produce reasonable results in the cases of index intersection plans and subquery plans (rooted $or)?
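The first concern above can be stated numerically (hypothetical figures; the field names follow the proposal and are not real log fields yet): the docsScanned:docsPreserved ratio alone cannot separate a covered plan from a fetch-heavy one, but nFetched can.

```python
# Sketch: a covered plan and a fetch-heavy plan with identical
# docsScanned:docsPreserved ratios, distinguishable only by nFetched.

covered_plan = {"docsScanned": 100, "docsPreserved": 50, "nFetched": 0}
fetch_heavy  = {"docsScanned": 100, "docsPreserved": 50, "nFetched": 100}

def preserved_ratio(plan):
    return plan["docsPreserved"] / plan["docsScanned"]

# Same ratio for both plans...
assert preserved_ratio(covered_plan) == preserved_ratio(fetch_heavy) == 0.5
# ...but nFetched still tells the two cases apart.
assert covered_plan["nFetched"] != fetch_heavy["nFetched"]
```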

Comment by David Storch [ 10/Feb/20 ]

We've generated some further questions about how this should behave and need further definition from stakeholders in order to make progress.

Comment by Asya Kamsky [ 23/Jan/20 ]

> $match that filters documents based on a computed field will not be pushed down

I agree that the metric as you describe it would be correct - the relevant value is how many documents enter the in memory aggregation pipeline relative to nscanned.

Comment by Justin Seyster [ 13/Jan/20 ]

david.storch and I met last week to talk over what measurement we should use as the "docsPreserved" value in the log line. We came up with the idea of choosing a particular PlanStage to consider as the output of the "data access method" and counting the number of documents in that stage's output to determine what "docsPreserved" should be. For each query, the planner establishes the "data access method" portion of the plan with one of the public functions in planner_access.h.

Our plan is to modify the planner so that those functions mark the root QuerySolutionNode of the returned query plan as the data access output. That designation will carry over to the corresponding PlanStage, which will increment the "docsPreserved" measurement for each document it outputs. In aggregation plans, there will never be a $group, $unwind, or similar pipeline stage within the "data access method" subtree of the plan, so we'll be able to avoid the apples-to-oranges comparison problem that comes with using the ratio of documents scanned / documents returned when a returned document represents multiple input documents (as with $group) or a fraction of an input document (as with $unwind).

In most cases, this definition of "docsPreserved" will give the exact same results as counting the number of documents out of $cursor, but it will still apply even if future versions of the query planner blur the line between what's considered part of $cursor and what's considered part of the remaining pipeline.

We still have one open question about how to count documents. Is it the correct behavior that our current implementation will count a document as "preserved" even if it gets filtered out by a $match that was not "pushed down" into the data access plan? In particular, a $match that filters documents based on a computed field will not be pushed down. We are of the opinion that this behavior makes sense: we're primarily interested in how efficient the chosen index was, and we don't expect that a better index choice will improve a query that's filtering out documents based on a computed field. I'm interested in any opinions on this, though.
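The marking scheme described above can be sketched in miniature (hypothetical structures, not the server's actual QuerySolutionNode/PlanStage APIs): the planner marks the root of the data access subtree, and only the marked stage increments docsPreserved as documents flow through it, so stages above it (e.g. an in-memory group) cannot distort the count.

```python
# Minimal sketch of "mark the data access root and count its output".

class Stage:
    def __init__(self, name, children=(), is_access_root=False):
        self.name = name
        self.children = list(children)
        self.is_access_root = is_access_root

def execute(stage, docs, counters):
    # Pull documents through child stages first, then this stage.
    out = docs
    for child in stage.children:
        out = execute(child, out, counters)
    if stage.name == "MATCH":          # toy pushed-down filter: keep evens
        out = [d for d in out if d % 2 == 0]
    if stage.is_access_root:
        counters["docsPreserved"] += len(out)
    return out

# IXSCAN -> MATCH -> FETCH (marked as data access root) -> in-memory GROUP
plan = Stage("GROUP", [
    Stage("FETCH", [Stage("MATCH", [Stage("IXSCAN")])], is_access_root=True),
])
counters = {"docsPreserved": 0}
execute(plan, list(range(10)), counters)  # 10 docs in, 5 survive MATCH
```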

Comment by David Storch [ 19/Aug/19 ]

Thanks for filing this request eric.sommer! The $cursor stage is an implementation detail. I am very reluctant to include information on the log line which is closely coupled with the particulars of the existing execution engine. Furthermore, the query team is planning a substantial refactor in order to unify the execution code used for aggregate with the execution engine used for find. As a consequence of this refactor, we are likely to eliminate the $cursor stage from the execution of most aggregate operations.

After speaking with christopher.harris, my understanding is that getting a sense of the efficiency or selectivity of the data access plan (i.e. any IXSCAN or FETCH stages) prior to any subsequent $group, $unwind, $lookup, etc. is at the heart of this request. Perhaps we could instead report the percentage or count of documents examined which were preserved by a filter? That way the log line, through the combination of keysExamined, docsExamined, docsPreserved, and nReturned, would give you a good snapshot of the query's performance. For example, a query could examine 100 keys, fetch 50 documents, and filter out 10 of these, resulting in docsPreserved=40, and then return 20 documents due to a subsequent $group. Would that satisfy this improvement request?

As a final note, this information should already be available in explain() and will be improved substantially by SERVER-21784, which we are hoping to address in 4.4. However, I imagine that for real-world support and debugging scenarios, having this information readily available in the log line is also important.
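The worked example in the comment above (100 keys examined, 50 fetched, 40 preserved, 20 returned) can be laid out as the proposed log fields; the field names follow the proposal in this ticket and are not existing log fields.

```python
# The proposed counters, using the numbers from the comment above.
log_line = {
    "keysExamined": 100,   # index keys scanned
    "docsExamined": 50,    # documents fetched
    "docsPreserved": 40,   # documents surviving the access plan's filters
    "nReturned": 20,       # documents emitted after the $group stage
}

fetch_selectivity  = log_line["docsExamined"] / log_line["keysExamined"]   # 0.5
filter_selectivity = log_line["docsPreserved"] / log_line["docsExamined"]  # 0.8
# nReturned / docsPreserved reflects $group's collapsing of documents,
# not the quality of the chosen index, so it is deliberately left out.
```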

Comment by Craig Homa [ 19/Aug/19 ]

Assigning to david.storch so he can review and update the description and summary.

Generated at Thu Feb 08 05:01:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.