[SERVER-2454] Queries that are killed during a yield should return error to user instead of partial result set
Created: 01/Feb/11  Updated: 26/Feb/16  Resolved: 09/Jun/15

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.0.7
Fix Version/s: 2.6.12, 3.0.8, 3.1.5

Type: Bug
Priority: Critical - P2
Reporter: Aaron Staple
Assignee: James Wahlin
Resolution: Done
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File hasnext_repro.js    
Issue Links:
Depends
is depended on by SERVER-18493 Add FSM test interleaving collection/... Backlog
Documented
is documented by DOCS-5590 find/getMore will no longer return pa... Closed
Duplicate
is duplicated by SERVER-20973 Initial sync during index drop can ca... Closed
is duplicated by SERVER-14752 Btree cursor returns partial result w... Closed
is duplicated by SERVER-11960 Better handling of PlanExecutor::FAIL... Closed
is duplicated by SERVER-5169 partial result set rather than assert... Closed
is duplicated by SERVER-12689 Improve log message when capped curso... Closed
Related
related to SERVER-22195 queryoptimizer3.js failing on 2.6 Closed
related to SERVER-2816 invalidating cursors for one collecti... Closed
related to SERVER-13123 All callers of PlanExecutor::getNext ... Closed
related to SERVER-16920 Better error messages for operations ... Closed
is related to DOCS-1236 document cursor lifecycle Closed
Backwards Compatibility: Major Change
Operating System: ALL
Backport Completed:
Sprint: Quint Iteration 3.1.2, Quint Iteration 3, Quint Iteration 4, Quint Iteration 5
Participants:
Linked BF Score: 0

 Description   

While a query is yielded, it may be killed for a number of reasons, including:

  • collection drop
  • database drop
  • index drop

Killed queries that have generated a partial result set will return these partial results without returning an error. We should consider having killed queries fail with a useful error message.
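
To make the difference concrete, here is a minimal toy model in Python of the two behaviors described above. It is only a sketch and not the server's implementation: `ToyCursor`, `QueryKilled`, `yield_every`, `error_on_kill`, and `on_yield` are hypothetical names introduced for illustration.

```python
class QueryKilled(Exception):
    """Raised when a query is killed while yielded (hypothetical name)."""


class ToyCursor:
    """Toy stand-in for a yielding query; not the server's implementation."""

    def __init__(self, docs, yield_every=2, error_on_kill=True):
        self._docs = list(docs)
        self._yield_every = yield_every
        self._error_on_kill = error_on_kill
        self._killed_reason = None

    def kill(self, reason):
        # Called by a concurrent operation, e.g. a collection drop.
        self._killed_reason = reason

    def results(self, on_yield=None):
        for i, doc in enumerate(self._docs):
            if i and i % self._yield_every == 0:
                # Yield point: other operations may run here and kill the query.
                if on_yield:
                    on_yield(self)
                if self._killed_reason is not None:
                    if self._error_on_kill:
                        raise QueryKilled(
                            f"query killed during yield: {self._killed_reason}"
                        )
                    return  # reported behavior: stop silently, partial results
            yield doc


def drop_collection(cursor):
    cursor.kill("collection dropped")


# Reported behavior: the caller cannot tell the result set is incomplete.
silent = list(ToyCursor(range(5), error_on_kill=False).results(drop_collection))
print(silent)

# Proposed behavior: the caller gets an error instead of a silent truncation.
try:
    list(ToyCursor(range(5)).results(drop_collection))
except QueryKilled as exc:
    print(exc)
```

In the silent variant the caller receives fewer documents than exist and has no signal that anything went wrong, which is the problem this ticket describes.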



 Comments   
Comment by Githook User [ 12/Nov/15 ]

Author: James Wahlin (jameswahlin) <james.wahlin@10gen.com>

Message: SERVER-2454 Fix invalid object use on PlanExecutor::DEAD
Branch: v2.6
https://github.com/mongodb/mongo/commit/fd4e416bb12a604fbbc4c5ba7a6dd58e136115e1

Comment by Githook User [ 02/Nov/15 ]

Author: James Wahlin (jameswahlin) <james.wahlin@10gen.com>

Message: SERVER-2454 Backport of PlanExecutor::DEAD handling fix to 2.6.x
Branch: v2.6
https://github.com/mongodb/mongo/commit/c980c02f219e595279e2b81f558ec1c81e841573

Comment by Githook User [ 23/Oct/15 ]

Author: James Wahlin (jameswahlin) <james.wahlin@10gen.com>

Message: SERVER-2454 Return error on find/getMore PlanExecutor::DEAD

This is a partial backport to the 3.0 branch. It is intended to fix the
issue reported in SERVER-20973.
Branch: v3.0
https://github.com/mongodb/mongo/commit/817c963bea7f12220deee14ecaa512161d39555a

Comment by Githook User [ 08/Jul/15 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-2454 fix resync.js test to handle collection scan dying due to its position in a capped collection being deleted
Branch: master
https://github.com/mongodb/mongo/commit/dd3c12c2e331cd423319278e751ba38b7c5437ef

Comment by Githook User [ 08/Jul/15 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-2454 improve error messages for CollectionScan DEAD cases
Branch: master
https://github.com/mongodb/mongo/commit/332781c598c51e19001e05727679901a94c7fe27

Comment by Githook User [ 09/Jun/15 ]

Author: James Wahlin <james.wahlin@10gen.com>

Message: SERVER-2454 Remove trailing whitespace
Branch: master
https://github.com/mongodb/mongo/commit/0e19bd8c7f5a39fff85fe067260d434cda5bb6a4

Comment by Githook User [ 09/Jun/15 ]

Author: James Wahlin <james.wahlin@10gen.com>

Message: SERVER-2454 Improve PlanExecutor::DEAD handling
Branch: master
https://github.com/mongodb/mongo/commit/d690653daadef98652e58131ade8b34114f86ab2

Comment by J Rassi [ 01/Apr/15 ]

The work for this ticket is to replace all instances of PlanExecutor::DEAD with PlanExecutor::FAILURE. In particular: when a PlanExecutor is killed, subsequent calls to PlanExecutor::getNext() should return PlanExecutor::FAILURE with an error Status instead of PlanExecutor::DEAD.

After a PlanExecutor is killed, it is invalid to examine the state of the collection (or any other storage) associated with the executor. Care should be taken to ensure that any existing code which treats PlanExecutor::DEAD specially does not perform any invalid operations (e.g. dereferencing the executor's collection pointers) after this change is made. Overrides of PlanStage::getStats() should get a particularly close inspection.
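
The contract described in this comment could be sketched roughly as follows. This is a toy Python model, not the server's C++ code: `ToyPlanExecutor`, `Status`, `ExecState`, and the method names are assumptions made for illustration, and the `ADVANCED` and `IS_EOF` states are assumed here only to make the example complete.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional, Tuple


class ExecState(Enum):
    ADVANCED = "ADVANCED"  # assumed name: a document was produced
    IS_EOF = "IS_EOF"      # assumed name: no more results
    FAILURE = "FAILURE"    # also used for a killed executor; no separate DEAD


@dataclass
class Status:
    ok: bool
    reason: str = ""


class ToyPlanExecutor:
    def __init__(self, collection):
        self._collection = collection  # must not be touched once killed
        self._pos = 0
        self._kill_status: Optional[Status] = None

    def kill(self, reason: str) -> None:
        self._kill_status = Status(False, reason)
        # Drop the reference so that any later access fails loudly
        # instead of reading invalid storage.
        self._collection = None

    def get_next(self) -> Tuple[ExecState, Any]:
        if self._kill_status is not None:
            # Killed: report FAILURE together with an error Status.
            return ExecState.FAILURE, self._kill_status
        if self._pos >= len(self._collection):
            return ExecState.IS_EOF, None
        doc = self._collection[self._pos]
        self._pos += 1
        return ExecState.ADVANCED, doc

    def get_stats(self) -> dict:
        # Uses only executor-local counters; never dereferences the collection.
        return {
            "docs_returned": self._pos,
            "killed": self._kill_status is not None,
        }
```

The point of `get_stats()` in this sketch is that it stays safe to call after `kill()` because it reads only state owned by the executor, which is the property the comment asks reviewers to check in overrides of PlanStage::getStats().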

Comment by David Storch [ 17/Mar/15 ]

We should consider collapsing PlanExecutor::FAILURE and PlanExecutor::DEAD into one; i.e. we should not distinguish between query failure and a query being killed by a concurrent operation during a yield.

Comment by Andrew Emil (Inactive) [ 13/Jan/15 ]

attached reproduction script

Comment by Aaron Staple [ 11/Feb/13 ]

Client cursors used internally should set QueryOption_NoCursorTimeout, in which case timeouts should not occur.

Comment by Ben Becker [ 11/Feb/13 ]

Could this also occur if a cursor times out during a yield? For example, if a MapReduce job runs, and a user executes an atomic operation that runs for ~10 minutes?

Comment by Eliot Horowitz (Inactive) [ 01/Feb/11 ]

Low, just go by Jira.

Yes, this ticket is a good place to do so.

Comment by Aaron Staple [ 01/Feb/11 ]

I can investigate all the causes - how high a priority is that?

One other note - if we error out in some cases, we need to be careful about how best plans are recorded. I had added a TODO comment about this, but it is no longer there.

Comment by Eliot Horowitz (Inactive) [ 01/Feb/11 ]

I think the correct fix will be keeping track of why a cursor is dropped.
Causes:

  • index drop - error
  • collection drop - return partial results
  • db drop - partial drops
  • close db - error

what else?
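
One way to picture "keeping track of why a cursor is dropped" is a small registry that records a reason whenever cursors on a namespace are invalidated. The Python sketch below is purely illustrative: `KillReason`, `ToyCursorRegistry`, and their methods are hypothetical names, and the per-cause policy (error versus partial results) is deliberately left out.

```python
from enum import Enum


class KillReason(Enum):
    # Causes taken from the list above.
    INDEX_DROP = "index drop"
    COLLECTION_DROP = "collection drop"
    DB_DROP = "db drop"
    CLOSE_DB = "close db"


class ToyCursorRegistry:
    """Hypothetical registry that remembers why each cursor was dropped."""

    def __init__(self):
        self._live = {}     # namespace -> set of live cursor ids
        self._dropped = {}  # cursor id -> KillReason

    def register(self, cursor_id, ns):
        self._live.setdefault(ns, set()).add(cursor_id)

    def invalidate(self, ns, reason):
        # Drop every live cursor on the namespace and record the cause.
        for cursor_id in self._live.pop(ns, set()):
            self._dropped[cursor_id] = reason

    def why_dropped(self, cursor_id):
        # Returns None when the cursor is still live or was never registered.
        return self._dropped.get(cursor_id)


registry = ToyCursorRegistry()
registry.register(1, "test.a")
registry.register(2, "test.b")
registry.invalidate("test.a", KillReason.INDEX_DROP)
print(registry.why_dropped(1), registry.why_dropped(2))
```

With the reason recorded, the code that serves the next batch can decide per cause whether to raise an error or end the result set, and can include the cause in the error message.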
