[SERVER-22537] segfault running aggregation query Created: 09/Feb/16 Updated: 19/Nov/16 Resolved: 29/Feb/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 3.2.1 |
| Fix Version/s: | 3.2.4, 3.3.3 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Adinoyi Omuya | Assignee: | Max Hirschhorn |
| Resolution: | Done | Votes: | 0 |
| Labels: | code-and-test |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Completed: | |
| Steps To Reproduce: | 1. Get the data 2. Restore using mongorestore --gzip --archive=attendees.bson.gz and mongorestore --gzip --archive=flights201406.bson.gz 3. Run the aggregation query below on the attendees collection. |
| Sprint: | Query 10 (02/22/16), Query 11 (03/14/16) |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
Aggregation pipeline (on tableau database):
Backtrace (sorry, demangler.com is down):
Ran this on OSX 10.11.2 (15C50) using the following mongod:
I'm able to reliably reproduce both on WiredTiger and mmapv1. |
| Comments |
| Comment by Githook User [ 29/Feb/16 ] |
|
Author: {u'username': u'visemet', u'name': u'Max Hirschhorn', u'email': u'max.hirschhorn@mongodb.com'} Message: This fixes an issue where the DBDirectClient used by the $lookup stage Calling PipelineProxyStage::doDetachFromOperationContext() now causes (cherry picked from commit e1928f36d21a4193802d7fbdcb8fcd6df58f7aa7) |
| Comment by Githook User [ 29/Feb/16 ] |
|
Author: {u'username': u'visemet', u'name': u'Max Hirschhorn', u'email': u'max.hirschhorn@mongodb.com'} Message: This fixes an issue where the DBDirectClient used by the $lookup stage Calling PipelineProxyStage::doDetachFromOperationContext() now causes |
| Comment by Max Hirschhorn [ 29/Feb/16 ] |
|
Requesting backport to the 3.2 branch only because no stage other than $lookup would trigger a getMore on its DBDirectClient, so this bug won't ever manifest on the 3.0 branch or earlier. In particular, DocumentSourceGeoNear runs the "geoNear" command using its DBDirectClient, which always returns a single batch of results. |
| Comment by Max Hirschhorn [ 23/Feb/16 ] |
|
I ran the aggregation query on the test.attendees collection as described by the steps in the ticket and was able to get more information from ASan. charlie.swanson and I looked through the ASan output and were able to piece together what had happened. The issue is that PipelineProxyStage::doReattachToOperationContext() doesn't necessarily change the operation context of the DBDirectClient underlying the $lookup stage. Currently this is only done by calling MongodImplementation::directClient(). The way that this can manifest is if you need to do a getMore on the DBDirectClient for the same input document to $lookup (i.e. having many documents in the from collection that match). The proposal is to add a Pipeline::setOpCtx() that will
|
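The failure mode described in the 23/Feb/16 comment can be illustrated with a small, self-contained C++ sketch. Every type and method below (OpCtx, DirectClient, LookupStage, Pipeline, reattach, clientIsStaleAfterReattach) is a hypothetical stand-in and not the MongoDB server code. The propagating branch only assumes the general shape of a fix (push the new operation context down to the stage's client) and is not the committed change.

```cpp
#include <cassert>

// Hypothetical stand-in for an operation context; not the real class.
struct OpCtx {
    int id;
};

// Hypothetical client that caches the operation context it runs under.
class DirectClient {
public:
    void setOpCtx(OpCtx* opCtx) { _opCtx = opCtx; }
    OpCtx* opCtx() const { return _opCtx; }

private:
    OpCtx* _opCtx = nullptr;
};

// Hypothetical $lookup-like stage that owns its own client.
class LookupStage {
public:
    DirectClient& client() { return _client; }
    const DirectClient& client() const { return _client; }

private:
    DirectClient _client;
};

// Hypothetical pipeline holding one lookup stage.
class Pipeline {
public:
    explicit Pipeline(OpCtx* opCtx) : _opCtx(opCtx) {
        assert(opCtx != nullptr);
        _lookup.client().setOpCtx(opCtx);
    }

    // Between batches the pipeline gives up its operation context.
    void detach() { _opCtx = nullptr; }

    // With propagateToStages == false only the pipeline's own pointer is
    // updated, which models the reported problem. With true, the stage's
    // client is updated as well (assumed shape of a fix).
    void reattach(OpCtx* opCtx, bool propagateToStages) {
        assert(opCtx != nullptr);
        _opCtx = opCtx;
        if (propagateToStages) {
            _lookup.client().setOpCtx(opCtx);
        }
    }

    // True when the stage's client would run under the current context.
    bool clientSeesCurrentOpCtx() const {
        return _lookup.client().opCtx() == _opCtx;
    }

private:
    OpCtx* _opCtx;
    LookupStage _lookup;
};

// Returns true if, after a detach/reattach cycle, the stage's client still
// points at the previous operation context. Both contexts stay alive here,
// so the sketch only compares pointers and never touches freed memory.
bool clientIsStaleAfterReattach(bool propagateToStages) {
    OpCtx first{1};
    OpCtx second{2};
    Pipeline pipeline(&first);
    pipeline.detach();
    pipeline.reattach(&second, propagateToStages);
    return !pipeline.clientSeesCurrentOpCtx();
}
```

In this model, clientIsStaleAfterReattach(false) reports a stale client and clientIsStaleAfterReattach(true) does not. In a real server the first context would typically have been destroyed by the time of the next getMore, so a stale pointer like this is consistent with the use-after-free that ASan reported.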