[SERVER-56132] [SBE] Apparent infinite loop in SBE execution with spool growing infinitely large Created: 15/Apr/21 Updated: 29/Oct/23 Resolved: 22/Apr/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Query Execution |
| Affects Version/s: | None |
| Fix Version/s: | 5.0.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | David Storch | Assignee: | David Storch |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | | |
| Remaining Estimate: | Not Specified | | |
| Time Spent: | Not Specified | | |
| Original Estimate: | Not Specified | | |
| Attachments: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Query Execution 2021-04-19, Query Execution 2021-05-03 |
| Participants: |
| Description |
|
I stumbled upon this somewhat incidentally while investigating a fuzzer failure, which led me to create a repro script (attached as repro.js) based on a fuzzer-generated script. The script just inserts some fuzzer-generated data, creates a few indexes, and then runs a fuzzer-generated aggregate command. The command involves a $sort which is pushed down to the SBE layer. There are two candidate plans, both requiring a blocking SORT stage. SBE multi-planning occurs, and the following plan wins, represented below as both a QuerySolutionNode tree and an SBE tree:
The winning plan exited early but had not yet reached EOF, so the SBE multi-planning code closes and re-opens the plan. The SBE plan appears to loop infinitely during re-opening, and the spool sp1, used for the generic index scanning algorithm, appears to grow unbounded in size. I need to dig more into why this is happening, but my current hypothesis is that closing and re-opening the plan puts it into some unexpected state, and as a result the stack-spool-based index scanning algorithm does not terminate. |
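For context, the sequence the multi-planner drives here is roughly open, early exit, close, then open again from scratch. A minimal sketch of that lifecycle follows; the interface is illustrative only, not the actual sbe::PlanStage API, and the function names are assumptions.

```cpp
// Simplified sketch of the lifecycle exercised by SBE multi-planning.
// Illustrative interfaces only, not MongoDB's real PlanStage class.
struct StageSketch {
    // reOpen == true means the stage is re-opened without an intervening
    // close(); reOpen == false means a fresh open.
    virtual void open(bool reOpen) = 0;
    // close() must discard all runtime state so that a subsequent
    // open(false) behaves exactly like the very first open.
    virtual void close() = 0;
    virtual bool getNext() = 0;
    virtual ~StageSketch() = default;
};

void trialThenExecute(StageSketch& root) {
    root.open(/*reOpen*/ false);   // trial run during multi-planning
    // ... trial period ends early (QueryTrialRunCompleted) ...
    root.close();                  // plan must be fully torn down here
    root.open(/*reOpen*/ false);   // post-trial execution from scratch
    while (root.getNext()) {
        // If a stage kept stale state across close(), this loop may never
        // terminate -- the symptom observed in this ticket.
    }
    root.close();
}
```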
| Comments |
| Comment by Githook User [ 23/Apr/21 ] |
|
Author: {'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'} Message: |
| Comment by Githook User [ 22/Apr/21 ] |
|
Author: {'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'} Message: |
| Comment by Anton Korshunov [ 19/Apr/21 ] |
|
david.storch Not closing a child in close() is clearly a bug. When reOpen is false, the assumption was that since the tree is being opened for the first time, all PlanStages should already be in their initial state and there is no need to clear the buffer. But since we do not clear the buffer on close() either, an attempt to open the tree a second time, after it was closed, could also result in an incorrect state. So your proposed fix makes total sense. |
| Comment by David Storch [ 16/Apr/21 ] |
|
I believe I now understand why the problem described above results in an infinite loop. The failure to properly close and reopen the union stage results in a bad internal state for the union. Namely, UnionStage::_currentStage is a pointer to the second branch of the union, which has the spool consumer and the ixscan stage. However, UnionStage::_currentStageIndex has a value of 0. This means that when stages higher in the tree attempt to read slot s2, the output of the union stage, they always end up reading s3 (the output slot for the first branch) rather than s8 (the output slot for the second branch). The value of s3 is always equal to the constant KeyString value KS(0A0104). Consequently, the spool simply fills up with many copies of the value KS(0A0104). |
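To illustrate the mismatch, here is a minimal sketch, not the real UnionStage code: getNext() advances whichever branch _currentStage points to, while the output slot is resolved through _currentStageIndex, so when the two disagree the reader keeps seeing a value that never changes. All type and member names below are illustrative stand-ins.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Simplified sketch of the inconsistent union state described above;
// the shape mirrors the ticket but the code is illustrative, not MongoDB's.
struct ValueSketch {
    int data = 0;  // stand-in for a KeyString value such as KS(0A0104)
};

struct BranchSketch {
    ValueSketch out;                 // this branch's output slot (s3 or s8)
    bool getNext() { return true; }  // pretend the branch keeps producing rows
};

struct UnionSketch {
    std::vector<std::unique_ptr<BranchSketch>> children;
    BranchSketch* currentStage = nullptr;  // branch that getNext() advances
    std::size_t currentStageIndex = 0;     // branch whose slot feeds the output

    bool getNext() {
        // Advances the second branch (spool consumer + ixscan) ...
        return currentStage->getNext();
    }
    const ValueSketch& output() const {
        // ... but parents reading the union's output slot (s2) are routed to
        // children[0]'s slot (s3), whose value is the constant KS(0A0104).
        return children[currentStageIndex]->out;
    }
};
```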
| Comment by David Storch [ 16/Apr/21 ] |
|
Aha! I've found the problem. Two problems, actually. The first is that LazySpoolProducerStage::close() fails to close its child stage. This means that when the multi-planning trial period ends with a QueryTrialRunCompleted early exit, the plan is improperly closed. The entire subtree beneath the lspool stage remains open:
When the tree is subsequently re-opened in order to start post-trial execution, the nlj plan is untouched, and is left with some stale state. This somehow causes the execution of the plan to never terminate. I don't fully understand exactly why yet, but this is definitely the root cause of the infinite loop. The other bug is that when the plan is closed and re-opened, sp1 is left with the value KS(0A0104) inside it without being cleared. This is because the spool producer stages are only clearing the spool buffer if reOpen is true:
In this case, reOpen is false because the plan was closed and is being fully reopened from scratch. Should we be clearing the buffer on open no matter what the value of reOpen is? Or should we instead be clearing the buffer when the spool producer is closed? The following patch appears to resolve the issue:
I'd like to consult with anton.korshunov regarding the semantics for when the spool buffer should be cleared, and regarding options for testing this fix. |
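The patch itself is not preserved in this export. As a rough sketch of the two options discussed above, clearing the buffer unconditionally in open() and also clearing it (and closing the child) in close(), under an assumed, simplified spool-producer shape; the names are illustrative, not the real LazySpoolProducerStage code or the change that landed:

```cpp
#include <vector>

// Illustrative sketch only; field and type names are assumptions.
struct ChildStageSketch {
    void open(bool reOpen) {}
    void close() {}
};

struct RowSketch {};

struct SpoolProducerSketch {
    ChildStageSketch* child = nullptr;
    std::vector<RowSketch> buffer;  // the spool buffer (sp1 in the plan above)

    void open(bool reOpen) {
        buffer.clear();             // previously cleared only when reOpen was true
        child->open(reOpen);
    }

    void close() {
        buffer.clear();             // drop buffered rows so open(false) starts clean
        child->close();             // previously close() did not close the child
    }
};
```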