[SERVER-39391] Segmentation fault when performing aggregation with $lookup.let and $lookup.pipeline Created: 06/Feb/19 Updated: 28/Jul/21 Resolved: 06/May/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 4.0.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Hubert Bielenia | Assignee: | Danny Hatcher (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Operating System: | ALL |
| Participants: |
| Description |
|
Hello, I'm encountering a segmentation fault in MongoDB 4.0.6 when executing one particular aggregation. The MongoDB instance is deployed in Docker 18.09.1, running on Ubuntu 18.04.1 on an EC2 t3.xlarge instance. I was unable to reproduce the error in any other environment; however, it is often reproducible within these specifications. I've narrowed the issue down to this set of example data and aggregation query:
db.actions
db.campaigns
The query
The db.data collection exists, but is empty. Adding some records to it didn't affect the outcome. The query above results either in this error:
Or in this segfault:
The issue appears to be caused by this expression in $lookup.pipeline:
Removing it makes the output appear:
As you can probably tell, the $match expression doesn't change anything here, since the data collection is empty. However, it was added earlier to address performance when the data collection exists and contains a large number of documents, and I would like to keep it, especially since it looks perfectly valid according to the documentation: https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/#specify-multiple-join-conditions-with-lookup |
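For reference, the general shape of a $lookup stage that uses let together with a pipeline containing a $match on $expr, as described in the linked documentation section, looks roughly like the sketch below. This is not the query from this report, which is not preserved in this export. The field name campaign_id, the variable name campaignId, and the output field name are hypothetical placeholders, and running it against db.actions with data as the joined collection is an assumption based only on the collection names mentioned above.

```javascript
// Illustrative sketch only: NOT the reporter's original query.
// Field and variable names are hypothetical placeholders.
const pipeline = [
  {
    $lookup: {
      from: "data",                          // joined collection (assumed)
      let: { campaignId: "$campaign_id" },   // expose a field of the outer document as $$campaignId
      pipeline: [
        // $expr is required to reference variables declared in `let`
        { $match: { $expr: { $eq: ["$campaign_id", "$$campaignId"] } } }
      ],
      as: "data"                             // output array field (hypothetical name)
    }
  }
];

// In the mongo shell this would be run as, for example:
//   db.actions.aggregate(pipeline)
```

With an empty joined collection, a stage of this form is expected to yield an empty array in the output field for every input document.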
| Comments |
| Comment by HAO JIE ELVIS JIAO [ 28/Jul/21 ] |
|
Hi hbielenia, we are hitting the same error. Could you let me know the root cause of this issue? Was it caused by the data? This would be very useful for us. Thanks. |
| Comment by Danny Hatcher (Inactive) [ 25/Apr/19 ] |
|
hbielenia are you still experiencing this issue? |
| Comment by Danny Hatcher (Inactive) [ 28/Feb/19 ] |
|
Hello Hubert, After speaking with some of my colleagues, we are not aware of anything that could be triggering this problem. Have you had success in speaking with Docker? Thank you, Danny |
| Comment by Hubert Bielenia [ 16/Feb/19 ] |
|
Hi, sorry for the delay. This week I made more attempts at reproducing the issue, including one in an identical environment (matching Docker and Ubuntu versions). Unfortunately, none of them was successful. I determined only one additional piece of information: recreating the container and wiping the data makes the issue go away until the server is restarted, at which point only the container is recreated and the data stays the same. After that, the query in the description causes errors and segfaults again. I thought this might indicate some kind of data corruption, but validation reported nothing:
To me, this would indicate that the problem lies with the Docker runtime and not Mongo, but I need to point out that Mongo is the only one of several services that misbehaves like this. Nonetheless, if your technical expertise suggests so, I'll file this with the Docker guys and see if they can help. You also requested mongod logs. The backtrace in the description is taken from there. If you want the full log, here's an example from a recent run:
|
| Comment by Hubert Bielenia [ 08/Feb/19 ] |
|
Hi Danny, thanks for your help. By "often reproducible" I meant that this query has sometimes resulted in a BufBuilder error and sometimes in a segfault. These seem related, but the report itself is about the segfault, hence my wording. Indeed, every attempt to execute this query resulted in one of the above. I'll provide you with the requested information next week. As to the reproducibility, I haven't tested with another Ubuntu 18.04 environment; I'll try that as well. Thanks again! |
| Comment by Danny Hatcher (Inactive) [ 08/Feb/19 ] |
|
Hello Hubert, I've tried using your provided documents and query on Ubuntu 18.04 but I'm unable to replicate your issue. You mention that you have been unable to cause the problem in other environments; does that mean you are unable to reproduce in other Ubuntu 18.04 environments as well? You also mention that it's "often reproducible" on that specific environment that sees the problem; how often is often? Could you reproduce the issue once again and provide the mongod log files from the server that fails? Thank you, Danny |