[SERVER-6354] mongos eats all memory  Created: 09/Jul/12  Updated: 15/Feb/13  Resolved: 21/Aug/12
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 2.1.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Azat Khuzhin | Assignee: | Ben Becker |
| Resolution: | Duplicate | Votes: | 2 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | "version" : "2.1.2-pre-" |
| Attachments: | |
| Issue Links: | |
| Operating System: | ALL |
| Participants: | |
| Description |
|
The mongos process eats all memory (94%).
On docs.mongodb.org I read that mongos is a process that does not need much memory.
Open files:
Sockets:
The "mongod" instance, which also lives on this host, is down because of this.
Swap is also already full.
And as 'free -m' shows, cached and buffers are very small. |
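For anyone reporting the same symptom, here is a minimal sketch of capturing the figures referenced above on a Linux host (the output filenames and the PID lookup are illustrative, not part of the original report):

```sh
# Overall memory and swap usage, as referenced above with 'free -m'
free -m > memory_summary.txt

# Virtual and resident sizes for mongos and mongod (VSZ/RSS in KB)
ps -C mongos,mongod -o pid,comm,vsz,rss > process_memory.txt

# Open file descriptor count (files plus sockets) for the mongos process
MONGOS_PID=$(pgrep -x mongos | head -n 1)
ls /proc/"$MONGOS_PID"/fd | wc -l > open_fd_count.txt
```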
| Comments |
| Comment by Ben Becker [ 21/Aug/12 ] |
|
This issue outlines a few possible memory leaks in various versions of mongos. I'd like to close this one out as a duplicate of the following two issues:
@Damon, I'm not sure if either of these will be helpful for your case, though. Would it be possible to upgrade to v2.0 or v2.2? |
| Comment by Azat Khuzhin [ 21/Aug/12 ] |
|
I don't use authentication. I'm using the following versions: |
| Comment by Ben Becker [ 21/Aug/12 ] |
|
For those experiencing this issue in v2.1+, are you running with authentication enabled? |
| Comment by Azat Khuzhin [ 23/Jul/12 ] |
|
Are there any improvements? |
| Comment by Azat Khuzhin [ 17/Jul/12 ] |
|
Also, maybe 'cursor' should be deleted in 'ParallelConnectionState::~ParallelConnectionState()'? |
| Comment by Azat Khuzhin [ 17/Jul/12 ] |
|
I suppose 'cursor' should be deleted at the end of queryOp(Request& r) in src/mongo/s/strategy_shard.cpp? |
| Comment by Andy Schwerin [ 13/Jul/12 ] |
|
@Damon, try setting MALLOC_ARENA_MAX to 8 in the environment before running mongos, per my comment above.
Let us know if this affects your virtual memory usage. Thanks! |
| Comment by Damon Cortesi [ 13/Jul/12 ] |
|
pmap of the mongos process using 12 GB virtual and 6 GB resident |
| Comment by Nick Brown [ 12/Jul/12 ] |
|
Scott, I cannot re-run the process at the moment to reproduce the behavior because of looming product demonstrations. But once it is safe for me to do so, I will get you the pmap. |
| Comment by Scott Hernandez (Inactive) [ 12/Jul/12 ] |
|
Damon, yes, it is safe to do on a production system. It just creates a text file of the memory usage of the process. |
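For reference, a minimal sketch of the pmap capture being requested (the PID lookup and the output filename are illustrative):

```sh
# Find the mongos PID (adjust if more than one mongos instance is running)
MONGOS_PID=$(pgrep -x mongos | head -n 1)

# Write an extended per-mapping breakdown of the process's memory to a text file
pmap -x "$MONGOS_PID" > mongos_pmap.txt
```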
| Comment by Andy Schwerin [ 12/Jul/12 ] |
|
I believe that this behavior is actually a bad interaction with the glibc arena-based malloc implementation. The basic problem is that mongos creates a lot of threads, and the newer glibc allocators create a number of arenas proportional to some function of the number of threads and the number of hardware execution contexts (cores, hardware threads, processors, whatever). The arenas are created lazily, and each one is initially 64 MB in size. Your pmaps file indicates that you probably got 1.8 GB worth of them.

To test this theory, you'd want to constrain the number of arenas that malloc will create, or try another allocation library (like tcmalloc), and see if the problem goes away. In previous encounters with this problem, switching to tcmalloc or controlling the number of arenas has worked for us internally. We're planning to move to tcmalloc, jemalloc, or some other allocator in 2.3.

Could you try setting the environment variable MALLOC_ARENA_MAX to 8, running mongos, and seeing if the behavior improves? 8 is the minimum useful value (it won't go lower). For production use, I don't know what number of arenas to advise. I'd probably rather recommend a switch to tcmalloc, which I believe can be achieved by setting LD_PRELOAD to pre-load the tcmalloc library at startup. |
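A minimal sketch of the two workarounds described above, one or the other rather than both at once (the mongos options and the tcmalloc library path are illustrative and vary by deployment and distribution):

```sh
# Workaround 1: cap the number of glibc malloc arenas before starting mongos
export MALLOC_ARENA_MAX=8
mongos --configdb cfg1.example.net:27019 --port 27017

# Workaround 2: pre-load tcmalloc instead of the glibc allocator
# (the library path is distribution-specific)
LD_PRELOAD=/usr/lib/libtcmalloc.so.4 mongos --configdb cfg1.example.net:27019 --port 27017
```

If the arena theory is correct, the virtual size reported by pmap should drop noticeably under the first workaround.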
| Comment by Damon Cortesi [ 12/Jul/12 ] |
|
Scott: naive question, but is it safe to run that in production? |
| Comment by Scott Hernandez (Inactive) [ 12/Jul/12 ] |
|
Nick, Damon, can you also provide pmaps? |
| Comment by Azat Khuzhin [ 12/Jul/12 ] |
|
Is anybody working on this? |
| Comment by Nick Brown [ 11/Jul/12 ] |
|
I have a similar problem with mongos 2.1.0. My process is in a development environment, with a cursor against one collection with relatively large documents, aggregating data into a few dozen other collections. The process is also running on the same machine, as is the mongod instance it's using. Mongos is consuming all the memory that my process and mongod are not. Before reducing the throughput of the process, the problem would cause the machine to start dropping processes. As it is now, it just gets slower and slower until the process finishes. The problem does not happen if I connect to mongod directly.
(mongostat and top output were pasted here; only the column headers survived this export.) |
| Comment by Damon Cortesi [ 10/Jul/12 ] |
|
Just wanted to chime in on this bug: I've seen pretty severe memory leaks with mongos 1.8.4. Attached is memory on the (dedicated) mongos instance over the past month. Each drop-off is when I have to manually restart mongos. |
| Comment by Azat Khuzhin [ 10/Jul/12 ] |
|
(top output was pasted here; only the column headers survived this export.) pmap attached in a minute. |
| Comment by Scott Hernandez (Inactive) [ 10/Jul/12 ] |
|
Okay, can you provide pmap data for the mongos process when it is using more than a few hundred MB of memory? |
| Comment by Azat Khuzhin [ 10/Jul/12 ] |
|
With v2.1.2 the behavior is the same. |
| Comment by Azat Khuzhin [ 10/Jul/12 ] |
|
I understand that these builds have bugs; okay, I'll try it and post the results here. Sorry, but I can't include this host in MMS. |
| Comment by Scott Hernandez (Inactive) [ 09/Jul/12 ] |
|
Can you run the 2.1.2 build instead of that nightly? There are some bugs in non-release dev builds, as well as in the dev "releases", since they are for testing/development. Is this host monitored in MMS, and if so, what is the group name? If you are not monitoring with MMS, would you mind doing so to help debug this issue? http://mms.10gen.com/ That mongostat output, was it taken after you restarted the mongos instance? Can you provide some updated mongostat numbers? |
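A minimal sketch of how updated mongostat numbers could be collected against the mongos (host, port, row count, interval, and output filename are illustrative):

```sh
# Sample the mongos every 5 seconds for 60 rows and save the output for the ticket
mongostat --host localhost --port 27017 --rowcount 60 5 > mongostat_mongos.txt
```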
| Comment by Azat Khuzhin [ 09/Jul/12 ] |
|
On the second graph, free memory in % is the blue line. |
| Comment by Azat Khuzhin [ 09/Jul/12 ] |
|
I also run mapreduce, 4-6 jobs per day. |
| Comment by Azat Khuzhin [ 09/Jul/12 ] |
|
I don't have any more machines. I've attached a second graph. I'm using a development version because I need some features from it. It's a production server that handles ~90 qps.
It's also interesting that insert = query = update in the stats. |
| Comment by Scott Hernandez (Inactive) [ 09/Jul/12 ] |
|
Are you running anything else on this host? When you restart the process, what does memory look like initially? What is this mongos instance used for? Is this a test environment, since you are using a development version of MongoDB, and if so, what kinds of tests are you running? |
| Comment by Azat Khuzhin [ 09/Jul/12 ] |
|
Before this, I hadn't used mongos in production, and the servers where mongodb is running don't use swap at all. I've also attached graphs; swap size is 30516 MB. |
| Comment by Azat Khuzhin [ 09/Jul/12 ] |
Resident memory = 43 GB (it's very huge). |