[SERVER-52917] Improve reporting of save/restoreState metrics in allPlansExecution Created: 17/Nov/20  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Chris Harris Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Query Execution
Participants:

 Description   

The saveState and restoreState metrics for some entries in the allPlansExecution array of verbose explain output reflect the counts accumulated over the full execution of the winning plan, not just the trial period. This is misleading, because the entries in that array are supposed to report metrics from the trial phase only.



 Comments   
Comment by David Storch [ 17/Nov/20 ]

Regarding the root cause, I suspect what is happening is that all n candidate plans are added as children of the MultiPlanStage during the trial period via calls to MultiPlanStage::addPlan(). After the trial period completes and the winning plan is selected, the rejected candidates stay in place as children of the MultiPlanStage but are not executed any further. When yields occur, saveState()/restoreState() calls are propagated to the entire tree, including the lingering children of the MultiPlanStage corresponding to rejected plans. As a result, the rejected plans display the same yield count as the winning plan.
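The suspected mechanism can be illustrated with a small toy model. This is only a sketch in Python, not the server's C++ code: the `Stage` and `MultiPlan` classes, their method names, and the yield schedule are all hypothetical stand-ins for the behavior described above.

```python
class Stage:
    """Hypothetical stand-in for a plan stage; it only counts calls."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.works = 0
        self.save_state = 0
        self.restore_state = 0

    def save(self):
        # A yield saves state on this stage and on every child.
        self.save_state += 1
        for child in self.children:
            child.save()

    def restore(self):
        # Resuming after a yield restores state on the whole subtree.
        self.restore_state += 1
        for child in self.children:
            child.restore()


class MultiPlan(Stage):
    """Hypothetical multi-planner that keeps rejected plans as children."""

    def __init__(self):
        super().__init__("MULTI_PLAN")
        self.winner = None

    def add_plan(self, plan):
        self.children.append(plan)

    def run_trial(self, works_per_plan):
        # Trial period: every candidate does some work, with no yields here.
        for plan in self.children:
            plan.works += works_per_plan
        # Assumption for the sketch: the first candidate wins.
        self.winner = self.children[0]

    def run_winner(self, works, yield_every):
        # Only the winner keeps working, but yields reach all children.
        for i in range(1, works + 1):
            self.winner.works += 1
            if i % yield_every == 0:
                self.save()
                self.restore()


root = MultiPlan()
winning, rejected = Stage("IXSCAN a_1"), Stage("IXSCAN b_1")
root.add_plan(winning)
root.add_plan(rejected)
root.run_trial(works_per_plan=10)
root.run_winner(works=100, yield_every=25)
```

In this model the rejected plan stops accumulating `works` after the trial, yet its `save_state` and `restore_state` counters keep pace with the winner's, which matches the symptom in the description.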

One way to fix this would be to deallocate the rejected plans after the trial period ends (though runtime modification of the plan tree might be undesirable). Another, more invasive approach would be to change the architecture so that the multi-planner is not implemented as a PlanStage; this is the approach taken by the SBE engine, and it is also suggested by SERVER-16894. A third possibility would be to change the explain code to capture the execution stats of the rejected plans immediately after the trial period ends, rather than waiting until the winning plan has finished running.
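The third possibility, capturing stats as soon as the trial ends, might look roughly like the following. This is a self-contained Python sketch under stated assumptions, not the actual explain code: `Stats`, `CandidatePlan`, and `run_with_trial_snapshot` are hypothetical names, and the first candidate is arbitrarily treated as the winner.

```python
from dataclasses import dataclass, replace


@dataclass
class Stats:
    works: int = 0
    save_state: int = 0
    restore_state: int = 0


class CandidatePlan:
    """Hypothetical candidate plan that carries its own live counters."""

    def __init__(self, name):
        self.name = name
        self.stats = Stats()

    def do_yield(self):
        # One yield is modeled as one saveState plus one restoreState.
        self.stats.save_state += 1
        self.stats.restore_state += 1


def run_with_trial_snapshot(plans, trial_works, winner_works, yield_every):
    # Trial period: every candidate performs the same amount of work.
    for plan in plans:
        plan.stats.works += trial_works

    # Copy each plan's stats now, before the winner continues running.
    # dataclasses.replace() with no overrides returns an independent copy.
    trial_snapshot = {plan.name: replace(plan.stats) for plan in plans}

    # Assumption for the sketch: the first candidate wins.
    winner = plans[0]
    for i in range(1, winner_works + 1):
        winner.stats.works += 1
        if i % yield_every == 0:
            # Yields still reach every plan, as in the current behavior.
            for plan in plans:
                plan.do_yield()

    return trial_snapshot


plans = [CandidatePlan("A"), CandidatePlan("B")]
snapshot = run_with_trial_snapshot(plans, trial_works=10,
                                   winner_works=100, yield_every=25)
```

Here the snapshot holds trial-only counters for every candidate, while the live counters on the rejected plan still pick up the later yields. Reporting from the snapshot would leave the plan tree untouched, at the cost of copying stats once per candidate.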
