[SERVER-8456] Mongod memory leak during MapReduce in 2.2.x Created: 06/Feb/13 Updated: 15/Nov/21 Resolved: 01/Apr/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | MapReduce |
| Affects Version/s: | 2.2.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Dan Cooper | Assignee: | Tad Marshall |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | memory-leak | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | CentOS 6.2 |
| Attachments: | |
| Issue Links: | |
| Operating System: | Linux |
| Participants: | |
| Description |
|
During map reduce jobs and index drops/creates, we are seeing memory consistently increase over a period of time until the machine finally runs out and the kernel kills the mongod process with: kernel: Out of memory: Kill process 13794 (mongod) score 908 or sacrifice child. We did not have these issues prior to 2.2.x |
| Comments |
| Comment by Tad Marshall [ 27/Mar/13 ] |
|
So far, we have not been able to reproduce the memory leak internally; there may be details in the specific usage of MapReduce that need to be investigated to resolve this. We'll switch to a private ticket to try to learn what specifics in this case may be at the root of the memory leak. If anyone reading this has a reproducible case that they can share with us, please post the details here. |
| Comment by Dan Cooper [ 26/Mar/13 ] |
|
Tad - Yes, the JS file is what I uploaded; that's what seems to be causing the leak, from what my developers tell me. I can ask if we can do a mongodump to repro the data. I'm actually in your building for training today; I'm in the back of the training room if you want to discuss in person. Dan |
| Comment by Tad Marshall [ 26/Mar/13 ] |
|
Hi Dan, Sorry this has dragged on so long. There are some comments early in this ticket about uploading files with scp, but no notes saying what was uploaded. I found a JavaScript file ... is that all we have to go on? We can change this to a private ticket if you prefer, but our attempts to reproduce your problem with dummy data have not succeeded. I think we need real data to dupe and debug this. Tad P.S. Version 2.2.4-rc0 will be out soon, and it includes a bug fix that seems too small to account for the leaks you are seeing, but you could test it and let us know. |
| Comment by Dan Cooper [ 26/Mar/13 ] |
|
Do you need anything from me for more data points? Our mongods fail daily now, and we had to write a puppet script to restart them automatically. Clearly this does not inspire confidence in the product within the team. |
| Comment by auto [ 21/Mar/13 ] |
|
Author: Ben Becker <ben.becker@10gen.com> Date: 2013-02-12T20:32:24Z Message: |
| Comment by Tad Marshall [ 20/Mar/13 ] |
|
Switching from SpiderMonkey in version 2.2 to V8 in version 2.4 changed the leak situation in both good and bad ways. We've been able to fix many of the new sources in our V8 code, but the work is ongoing. We don't at this point have concrete advice on what you should change in your MapReduce code to prevent or eliminate the leaks, but if you wanted to try version 2.4.0 we would be very interested in what you learn. We've fixed the cases that we've been able to reliably reproduce, but we don't think that we've fixed everything yet, so more data points would be helpful. |
| Comment by Dan Cooper [ 19/Mar/13 ] |
|
Hey there, was this fixed in 2.4? Never got an answer to my previous question. |
| Comment by Dan Cooper [ 02/Mar/13 ] |
|
Hi guys, could we find out what about our map/reduce jobs is causing the leak so maybe we can fix it? |
| Comment by Dan Cooper [ 21/Feb/13 ] |
|
Yes, the only server process running on those boxes is mongod. We run mongos on all the client nodes calling the mongods. |
| Comment by Ben Becker [ 21/Feb/13 ] |
|
Hi Dan, Just a quick update. I've moved the slow ClientCursor leak issue into a separate ticket. I also wanted to ask – in the attached graph, is the only process running mongod? Do you run any mongos instances on the same node? Thanks, |
| Comment by auto [ 12/Feb/13 ] |
|
Author: Ben Becker <ben.becker@10gen.com> Date: 2013-02-12T20:32:24Z Message: |
| Comment by Ben Becker [ 12/Feb/13 ] |
|
Hi Dan, We're still diagnosing the leak you reported. The patch that I intended to mark for backport does fix a (very) slow cursor leak, but I do not believe this will fix the issue you reported. Apologies for any confusion – we'll keep you posted as we make progress. Best, |
| Comment by Dan Cooper [ 11/Feb/13 ] |
|
I noticed you updated this to have a backport, do you know when that will be available for 2.2.2? |
| Comment by Dan Cooper [ 09/Feb/13 ] |
|
1. 2.2.2 though we have seen this in 2.2.0 |
| Comment by Ben Becker [ 07/Feb/13 ] |
|
Hi Dan, Just a few questions that will help track this down:
I would be happy to supply a private SCP server to upload the scripts if there are any privacy concerns. Otherwise please feel free to attach them to this ticket. Thanks! |