[SERVER-24755] explain("executionStats") can attempt to access a collection after it has been dropped Created: 23/Jun/16 Updated: 19/Nov/16 Resolved: 26/Sep/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 3.3.8 |
| Fix Version/s: | 3.3.15 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Isabella Siu (Inactive) | Assignee: | David Storch |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Sprint: | TIG 16 (06/24/16), Query 2016-10-10 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 0 | ||||||||
| Description |
|
Calling explain() while concurrent clients are running can cause the server to crash with a segmentation fault or to abort.
|
| Comments |
| Comment by Githook User [ 26/Sep/16 ] | |||||
|
Author: David Storch (dstorch) <david.storch@10gen.com> Message: | |||||
| Comment by David Storch [ 26/Sep/16 ] | |||||
|
I can reproduce this crash reliably by running the following commands in separate shells connected to the same server:
I initially suspected that this was due to a data race inside the PlanCache, but this turned out not to be the case. Instead, this is a bug in explain which can cause us to attempt to access the PlanCache of a collection that has already been dropped.

If the explain verbosity is "executionStats" or higher, we execute the query in order to gather the requested statistics:
https://github.com/mongodb/mongo/blob/r3.3.14/src/mongo/db/query/explain.cpp#L719-L723

If execution of the query fails, explain should report this by setting the executionSuccess field to false. The explain command itself, however, does not fail. When the error status is due to the collection being dropped during a yield, we still attempt to access the collection in order to obtain a value for the indexFilterSet field:
https://github.com/mongodb/mongo/blob/r3.3.14/src/mongo/db/query/explain.cpp#L595-L606

To fix this, we should not attempt to gather a value for indexFilterSet when PlanExecutor::executePlan() returns with ErrorCodes::QueryPlanKilled. | |||||
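
The proposed guard can be illustrated with a minimal, self-contained sketch. This is not the actual patch: `ErrorCode`, `Collection`, `Catalog`, `ExplainOutput`, and `explainExecutionStats` are hypothetical stand-ins invented for illustration, and the real logic in explain.cpp is structured differently. The sketch only shows the idea of skipping the indexFilterSet lookup when execution was killed.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins; these are NOT the real MongoDB server types.
enum class ErrorCode { OK, QueryPlanKilled };

struct Collection {
    bool hasIndexFilter = false;  // stands in for the index filter lookup
};

// Simulates a catalog whose collection can be dropped while a query yields.
struct Catalog {
    std::unique_ptr<Collection> coll;
    void drop() { coll.reset(); }
};

struct ExplainOutput {
    bool executionSuccess = false;
    bool indexFilterSetKnown = false;  // false when the value was not gathered
    bool indexFilterSet = false;
};

// Builds explain output after the query has been executed with execStatus.
ExplainOutput explainExecutionStats(const Catalog& catalog, ErrorCode execStatus) {
    ExplainOutput out;
    out.executionSuccess = (execStatus == ErrorCode::OK);

    // The guard: a killed plan means the collection may already be gone,
    // so do not touch it to compute indexFilterSet.
    if (execStatus == ErrorCode::QueryPlanKilled) {
        return out;
    }

    // In this sketch, a plan that was not killed implies the collection exists.
    assert(catalog.coll && "collection must exist when the plan was not killed");
    out.indexFilterSet = catalog.coll->hasIndexFilter;
    out.indexFilterSetKnown = true;
    return out;
}
```

In this sketch the explain call itself still succeeds after a drop; it reports executionSuccess as false and leaves indexFilterSet ungathered instead of dereferencing the dropped collection.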
| Comment by Isabella Siu (Inactive) [ 27/Jun/16 ] | |||||
|
david.storch It's okay, I'm not actively investigating it. | |||||
| Comment by David Storch [ 27/Jun/16 ] | |||||
|
It looks like this may be a data race inside the PlanCache introduced during 3.2 development for partial indexes. Nice find, isabella.siu! Is it okay if I move this onto the query team's backlog, or are you actively investigating? |