[SERVER-58356] Cannot kill the dataSize operation Created: 07/Jul/21 Updated: 29/Oct/23 Resolved: 23/Sep/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.4.6 |
| Fix Version/s: | 5.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Ivan Grigolon | Assignee: | Denis Grebennicov |
| Resolution: | Fixed | Votes: | 2 |
| Labels: | query-director-triage |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: | Testing:
Create enough data that the dataSize command runs long enough for you to attempt to kill the operation.
Find the op and try to kill it (the op does not get killed).
From the logs:
|
| Sprint: | QE 2021-08-09, QE 2021-08-23, QE 2021-09-20, QE 2021-10-04 |
| Participants: |
| Case: | (copied to CRM) |
| Description |
|
Cannot kill the dataSize operation; see Steps To Reproduce. |
| Comments |
| Comment by Githook User [ 21/Sep/21 ] |
|
Author: {'name': 'Denis Grebennicov', 'email': 'denis.grebennicov@mongodb.com', 'username': 'denis631'} Message: |
| Comment by Kyle Suarez [ 24/Aug/21 ] |
|
After discussion in Query Execution triage, we suspect that the original author of the command wrote it without yielding because they wanted to guarantee an accurate data size. If the query were to yield and then resume, the final result could be a number that doesn't represent the actual size of the collection at any single point in time. ian.boros suggests that changing the yield policy here to INTERRUPT_ONLY would make the operation interruptible without forcing it to yield, so an uninterrupted run would still return an accurate size. Moving this into the Quick Wins bucket. When we pick this up, we need to ensure that any change to the yield policy makes the command interruptible without affecting data size correctness when the command is not interrupted. |
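To make the interrupt-without-yield idea concrete, here is a toy model in Python. It is not MongoDB code: `data_size`, `OperationInterrupted`, and `kill_flag` are hypothetical names chosen for illustration, and the list of integers stands in for document sizes. The loop checks a kill flag once per document but never pauses or resumes, so a run that is not killed still visits every document exactly once.

```python
import threading
from typing import Iterable


class OperationInterrupted(Exception):
    """Raised when a kill request is observed between documents (hypothetical)."""


def data_size(doc_sizes: Iterable[int], kill_flag: threading.Event) -> int:
    """Sum document sizes, checking the kill flag once per document.

    The loop only aborts when asked to; it never pauses and resumes,
    so an uninterrupted run sees every document exactly once.
    """
    total = 0
    for size in doc_sizes:
        if kill_flag.is_set():
            raise OperationInterrupted("scan was killed")
        total += size
    return total


# Uninterrupted run: the full, accurate total is returned.
kill_flag = threading.Event()
uninterrupted_total = data_size([100, 250, 50], kill_flag)

# Killed run: the scan aborts instead of returning a partial total.
kill_flag.set()
try:
    data_size([100, 250, 50], kill_flag)
    was_interrupted = False
except OperationInterrupted:
    was_interrupted = True
```

The design point the sketch tries to capture is that a killed scan raises rather than returning a partial sum, so correctness of the uninterrupted case is untouched.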
| Comment by Kyle Suarez [ 20/Aug/21 ] |
|
After a quick chat with Benety and some code examination, I am inclined to think this code does lie in the Query Execution realm. During the execution of the command, we initiate collection or index scans with PlanYieldPolicy::YieldPolicy::NO_YIELD. We then manually iterate over the PlanExecutor in a loop and never check for interrupt:
Sending back to the Query Execution triage queue for scheduling and discussing the viability of
|
| Comment by Ana Meza [ 19/Aug/21 ] |
|
Kyle, could you please find out who owns the dataSize command? |