[SERVER-2454] Queries that are killed during a yield should return error to user instead of partial result set Created: 01/Feb/11 Updated: 26/Feb/16 Resolved: 09/Jun/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 3.0.7 |
| Fix Version/s: | 2.6.12, 3.0.8, 3.1.5 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Aaron Staple | Assignee: | James Wahlin |
| Resolution: | Done | Votes: | 1 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Issue Links: |
|
| Backwards Compatibility: | Major Change |
| Operating System: | ALL |
| Backport Completed: | |
| Sprint: | Quint Iteration 3.1.2, Quint Iteration 3, Quint Iteration 4, Quint Iteration 5 |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
While a query is yielded, it may be killed for a number of reasons, including:
Killed queries that have generated a partial result set will return these partial results without returning an error. We should consider having killed queries fail with a useful error message. |
| Comments |
| Comment by Githook User [ 12/Nov/15 ] |
|
Author: James Wahlin (jameswahlin) <james.wahlin@10gen.com>
Message: |
| Comment by Githook User [ 02/Nov/15 ] |
|
Author: James Wahlin (jameswahlin) <james.wahlin@10gen.com>
Message: |
| Comment by Githook User [ 23/Oct/15 ] |
|
Author: James Wahlin (jameswahlin) <james.wahlin@10gen.com>
Message: This is a partial backport to the 3.0 branch. It is intended to fix the |
| Comment by Githook User [ 08/Jul/15 ] |
|
Author: David Storch (dstorch) <david.storch@10gen.com>
Message: |
| Comment by Githook User [ 08/Jul/15 ] |
|
Author: David Storch (dstorch) <david.storch@10gen.com>
Message: |
| Comment by Githook User [ 09/Jun/15 ] |
|
Author: James Wahlin <james.wahlin@10gen.com>
Message: |
| Comment by Githook User [ 09/Jun/15 ] |
|
Author: James Wahlin <james.wahlin@10gen.com>
Message: |
| Comment by J Rassi [ 01/Apr/15 ] |
|
The work for this ticket is to replace all instances of PlanExecutor::DEAD with PlanExecutor::FAILURE. In particular: when a PlanExecutor is killed, subsequent calls to PlanExecutor::getNext() should return PlanExecutor::FAILURE with an error Status instead of PlanExecutor::DEAD. After a PlanExecutor is killed, it is invalid to examine the state of the collection (or any other storage) associated with the executor. Care should be taken to ensure that any existing code which treats PlanExecutor::DEAD specially does not perform any invalid operations (e.g. dereferencing the executor's collection pointers) after this change is made. Overrides of PlanStage::getStats() should get a particularly close inspection. |
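A rough sketch of the behavior described in the comment above, for illustration only. It is not the server's code: `ToyExecutor`, `Status`, `QueryResult`, and `runQuery` are hypothetical stand-ins, and the `killAfter` parameter exists only to simulate a kill arriving during a yield. The point it tries to show is that once an executor is killed, `getNext()` reports `FAILURE` with an error status, and the caller surfaces that error instead of the partial result set.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are the server's real classes.
enum class ExecState { ADVANCED, IS_EOF, FAILURE };

struct Status {
    bool ok = true;
    std::string reason;
};

class ToyExecutor {
public:
    explicit ToyExecutor(std::vector<int> docs) : _docs(std::move(docs)) {}

    // Simulates a concurrent operation killing the query while it is yielded.
    void kill(std::string reason) {
        _killed = true;
        _killReason = std::move(reason);
        _docs.clear();  // after a kill, the underlying storage must not be examined
    }

    // Returns FAILURE with an error status once killed (no separate DEAD state).
    ExecState getNext(int* out, Status* status) {
        assert(out != nullptr && status != nullptr);
        if (_killed) {
            *status = Status{false, "operation was killed: " + _killReason};
            return ExecState::FAILURE;
        }
        if (_pos == _docs.size()) {
            return ExecState::IS_EOF;
        }
        *out = _docs[_pos++];
        return ExecState::ADVANCED;
    }

private:
    std::vector<int> _docs;
    std::size_t _pos = 0;
    bool _killed = false;
    std::string _killReason;
};

struct QueryResult {
    Status status;
    std::vector<int> docs;
};

// Drains the executor. killAfter simulates a kill during a yield once that
// many results have been produced; a negative value means "never killed".
QueryResult runQuery(ToyExecutor& exec, int killAfter) {
    QueryResult result;
    int doc = 0;
    Status status;
    for (;;) {
        if (killAfter >= 0 && static_cast<int>(result.docs.size()) == killAfter) {
            exec.kill("killed during yield");
        }
        ExecState state = exec.getNext(&doc, &status);
        if (state == ExecState::ADVANCED) {
            result.docs.push_back(doc);
            continue;
        }
        if (state == ExecState::FAILURE) {
            result.docs.clear();     // do not hand back a partial result set
            result.status = status;  // surface the error to the user instead
        }
        return result;
    }
}
```

In this sketch, a caller that only checks `result.status.ok` can no longer mistake a killed query for a short but complete result set, which is the user-visible change the ticket asks for.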
| Comment by David Storch [ 17/Mar/15 ] |
|
We should consider collapsing PlanExecutor::FAILURE and PlanExecutor::DEAD into one; i.e. we should not distinguish between query failure and a query being killed by a concurrent operation during a yield. |
| Comment by Andrew Emil (Inactive) [ 13/Jan/15 ] |
|
attached reproduction script |
| Comment by Aaron Staple [ 11/Feb/13 ] |
|
The client cursors used internally should use the QueryOption_NoCursorTimeout, in which case timeouts shouldn't happen. |
| Comment by Ben Becker [ 11/Feb/13 ] |
|
Could this also occur if a cursor times out during a yield? For example, if a MapReduce job runs, and a user executes an atomic operation that runs for ~10 minutes? |
| Comment by Eliot Horowitz (Inactive) [ 01/Feb/11 ] |
|
Low, just go by JIRA. Yes, this ticket is a good place to do so. |
| Comment by Aaron Staple [ 01/Feb/11 ] |
|
I can investigate all the causes - how high a priority is that? One other note - if we error out in some cases, we need to be careful about how best plans are recorded. I had added a TODO comment about this, but it is no longer there. |
| Comment by Eliot Horowitz (Inactive) [ 01/Feb/11 ] |
|
I think the correct fix will be keeping track of why a cursor is dropped.
What else? |