[SERVER-6354] mongos all memory Created: 09/Jul/12  Updated: 15/Feb/13  Resolved: 21/Aug/12

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.1.2
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Azat Khuzhin Assignee: Ben Becker
Resolution: Duplicate Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

"version" : "2.1.2-pre-",
"gitVersion" : "d15b244ccabe1d8d0d1e93f2cb8de03d6b1d09d8",


Attachments: PNG File mongo_mem-month.png     File mongos-23012_-x.pmap     PNG File mongos-instance-memory-after-restart.png     PNG File mongos-instance-memory1.png     File mongos.pmap     File mongos_25335_20120710.pmap    
Issue Links:
Duplicate
is duplicated by SERVER-6785 Possible mongos memory leak Closed
Related
Operating System: ALL
Participants:

 Description   

The mongos process eats all of the memory (94%):

# cat /proc/12220/status
Name:   mongos
State:  S (sleeping)
Tgid:   12220
Pid:    12220
PPid:   1
TracerPid:      0
Uid:    113     113     113     113
Gid:    65534   65534   65534   65534
FDSize: 1024
Groups: 120 65534 
VmPeak: 79183616 kB
VmSize: 78010392 kB
VmLck:         0 kB
VmHWM:  46019552 kB
VmRSS:  46019552 kB
VmData: 77986428 kB
VmStk:        88 kB
VmExe:      5980 kB
VmLib:      3344 kB
VmPTE:    151204 kB
Threads:        82
SigQ:   0/16382
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 00000001800046e7
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed:   ffffff
Cpus_allowed_list:      0-23
Mems_allowed:   00000000,00000003
Mems_allowed_list:      0-1
voluntary_ctxt_switches:        32887795
nonvoluntary_ctxt_switches:     1434

# ps u 12220
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
mongodb  12220  2.6 92.8 78010920 46027844 ?   Sl   Jul05 147:46 /usr/bin/mongos --config /etc/mongos.conf --pidfilepath /work/mongo/mongos.pid

However, on docs.mongodb.org I read that mongos is a process that shouldn't need much memory.

Open files

# ls -l /proc/12220/fd | fgrep -v socket
total 0
lrwx------ 1 mongodb nogroup 64 Jul  9 08:24 0 -> /dev/null
l-wx------ 1 mongodb nogroup 64 Jul  9 08:24 1 -> /var/log/mongodb/mongos.log
l-wx------ 1 mongodb nogroup 64 Jul  9 08:24 2 -> /var/log/mongodb/mongos.log
lr-x------ 1 mongodb nogroup 64 Jul  9 08:24 3 -> /dev/urandom

Sockets:

# ls -l /proc/12220/fd | fgrep socket | wc -l
318

"mongod" instance is down because of this, that locate at this host too

[1209528.992738] active_anon:11272433 inactive_anon:867823 isolated_anon:0
[1209528.992739]  active_file:465 inactive_file:607 isolated_file:96
[1209528.992740]  unevictable:0 dirty:8 writeback:0 unstable:0
[1209528.992741]  free:35619 slab_reclaimable:11632 slab_unreclaimable:46570
[1209528.992741]  mapped:2193 shmem:1089 pagetables:103091 bounce:0
[1209528.992743] Node 0 DMA free:15736kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15152kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1209528.992751] lowmem_reserve[]: 0 2991 24201 24201
[1209528.992753] Node 0 DMA32 free:90612kB min:1736kB low:2168kB high:2604kB active_anon:1954904kB inactive_anon:498024kB active_file:428kB inactive_file:508kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:3063520kB mlocked:0kB dirty:4kB writeback:0kB mapped:2084kB shmem:56kB slab_reclaimable:7820kB slab_unreclaimable:93904kB kernel_stack:5368kB pagetables:21756kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1209528.992762] lowmem_reserve[]: 0 0 21210 21210
[1209528.992764] Node 0 Normal free:14140kB min:12332kB low:15412kB high:18496kB active_anon:20224444kB inactive_anon:1445744kB active_file:476kB inactive_file:908kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:21719040kB mlocked:0kB dirty:0kB writeback:0kB mapped:2400kB shmem:1668kB slab_reclaimable:20684kB slab_unreclaimable:36336kB kernel_stack:3416kB pagetables:200632kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1209528.992773] lowmem_reserve[]: 0 0 0 0
[1209528.992775] Node 1 Normal free:22748kB min:14092kB low:17612kB high:21136kB active_anon:22910384kB inactive_anon:1527524kB active_file:956kB inactive_file:1012kB unevictable:0kB isolated(anon):0kB isolated(file):256kB present:24821760kB mlocked:0kB dirty:28kB writeback:0kB mapped:4288kB shmem:2632kB slab_reclaimable:18024kB slab_unreclaimable:56040kB kernel_stack:4688kB pagetables:189976kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:32 all_unreclaimable? no
[1209528.992783] lowmem_reserve[]: 0 0 0 0
[1209528.992785] Node 0 DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15736kB
[1209528.992792] Node 0 DMA32: 14434*4kB 4164*8kB 34*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 91688kB
[1209528.992798] Node 0 Normal: 3459*4kB 6*8kB 0*16kB 2*32kB 3*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 14140kB
[1209528.992804] Node 1 Normal: 5917*4kB 47*8kB 2*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 24108kB
[1209528.992810] 17720 total pagecache pages
[1209528.992811] 14833 pages in swap cache
[1209528.992813] Swap cache stats: add 8503273, delete 8488440, find 2030254/2098475
[1209528.992814] Free swap  = 0kB
[1209528.992815] Total swap = 31249400kB
[1209529.149310] 12582896 pages RAM
[1209529.149312] 193293 pages reserved
[1209529.149314] 346058 pages shared
[1209529.149315] 12349055 pages non-shared
[1209529.149318] Out of memory: kill process 9308 (mongod) score 459992 or a child
[1209529.149364] Killed process 9308 (mongod)

Swap is also already full

# free -m
             total       used       free     shared    buffers     cached
Mem:         48396      48074        322          0         15       2126
-/+ buffers/cache:      45932       2464
Swap:        30516      30512          4

As 'free -m' shows, the cached and buffer memory is very small.
So "mongos" may have a memory leak.



 Comments   
Comment by Ben Becker [ 21/Aug/12 ]

This issue outlines a few possible memory leaks in various versions of mongos. I'd like to close this one out as a duplicate of the following two issues:

SERVER-6785: mongos leaks memory when accessing unsharded collections.
SERVER-6810: apparent leak related to authentication (may end up being a duplicate of SERVER-6785).

@Damon, I'm not sure if either of these will be helpful for your case though. Would it be possible to upgrade to v2.0 or v2.2?

Comment by Azat Khuzhin [ 21/Aug/12 ]

I don't use authentication.

I'm using the following versions:
2.1.2-pre-
2.1.3-pre-

Comment by Ben Becker [ 21/Aug/12 ]

For those experiencing this issue in v2.1+, are you running with authentication enabled?

Comment by Azat Khuzhin [ 23/Jul/12 ]

Are there any improvements?

Comment by Azat Khuzhin [ 17/Jul/12 ]

Also, maybe 'cursor' should be deleted in 'ParallelConnectionState::~ParallelConnectionState()'?

Comment by Azat Khuzhin [ 17/Jul/12 ]

I suppose that 'cursor' should be deleted at the end of src/mongo/s/strategy_shard.cpp:queryOp(Request& r)?
Maybe only if it is sharded, but maybe not.

Comment by Andy Schwerin [ 13/Jul/12 ]

@Damon, try setting MALLOC_ARENA_MAX to 8 in the environment before running mongos, per my comment above.

env MALLOC_ARENA_MAX=8 mongos ...

And let us know if this affects your virtual memory usage. Thanks!

Comment by Damon Cortesi [ 13/Jul/12 ]

pmap of the mongos process using 12 GB virtual and 6 GB resident memory.

Comment by Nick Brown [ 12/Jul/12 ]

Scott, I cannot re-run the process at the moment to reproduce the behavior because of looming product demonstrations. But, once it is safe for me to do so, I will get you the pmap.

Comment by Scott Hernandez (Inactive) [ 12/Jul/12 ]

Damon, yes it is safe to do on a production system. It just creates a text file of the memory usage of the process.
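
For reference, a minimal way to capture the requested output (a sketch, assuming the mongos PID of 12220 from the description; substitute your own PID and output filename):

# pmap -x 12220 > mongos-12220.pmap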

Comment by Andy Schwerin [ 12/Jul/12 ]

I believe that this behavior is actually a bad interaction with the glibc arena-based malloc implementation. The basic problem is that mongos creates a lot of threads, and the newer glibc allocators create a number of arenas proportional to some function of the number of threads and the number of hardware execution contexts (cores, hardware threads, processors, whatever). The arenas are created lazily, and each one is initially 64MB in size. Your pmaps file indicates that you probably got 1.8GB worth of them.

To test this theory, we'd like to constrain the number of arenas that malloc will create, or try another allocation library (such as tcmalloc), and see if the problem goes away. In previous encounters with this problem, switching to tcmalloc or controlling the number of arenas has worked for us internally. We're planning to move to tcmalloc, jemalloc, or some other allocator in 2.3.

Could you try setting the environment variable MALLOC_ARENA_MAX to 8, running mongos, and seeing if the behavior improves? 8 is the minimum useful value (it won't go lower). For production use, I don't know what number of arenas to advise; I'd probably rather recommend a switch to tcmalloc, which I believe can be achieved by setting LD_PRELOAD to pre-load the tcmalloc library at startup.
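
For illustration, a sketch of the LD_PRELOAD approach (the tcmalloc library name and path below are assumptions and vary by distribution; gperftools packages commonly ship libtcmalloc_minimal):

# env LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so.4 /usr/bin/mongos --config /etc/mongos.conf --pidfilepath /work/mongo/mongos.pid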

Comment by Damon Cortesi [ 12/Jul/12 ]

Scott: Naive question, is it safe to run that in production?

Comment by Scott Hernandez (Inactive) [ 12/Jul/12 ]

Nick, Damon, can you also provide pmaps?

Comment by Azat Khuzhin [ 12/Jul/12 ]

Is anybody working on this?

Comment by Nick Brown [ 11/Jul/12 ]

I have a similar problem with mongos 2.1.0. My process is in a development environment, with a cursor against one collection with relatively large documents, aggregating data into a few dozen other collections. The process is also running on the same machine, as is the mongod instance it's using. Mongos is consuming all the memory that my process and mongod are not. Before I reduced the throughput of the process, the problem would cause the machine to start dropping processes. As it is now, it just gets slower and slower until the process finishes. The problem does not happen if I connect to mongod directly.

insert query update delete getmore command vsize res faults netIn netOut conn repl time
351 335 176 175 0 1 32.3g 20g 0 85k 281k 62 RTR 15:57:53
282 289 139 143 0 13 32.3g 20g 0 61k 236k 62 RTR 15:57:54
50 44 26 24 0 1 32.3g 20g 0 13k 38k 62 RTR 15:57:55
118 117 59 59 0 1 32.3g 20g 1 27k 99k 62 RTR 15:57:56
306 309 155 151 0 13 32.3g 20g 0 83k 261k 62 RTR 15:57:57
259 255 128 131 0 1 32.3g 20g 3 61k 205k 62 RTR 15:57:58
339 339 168 171 0 13 32.3g 20g 0 79k 285k 62 RTR 15:57:59
312 316 159 153 0 1 32.3g 20g 0 61k 256k 62 RTR 15:58:00
315 316 154 161 0 1 32.3g 20g 0 62k 266k 62 RTR 15:58:01
329 332 168 161 0 13 32.3g 20g 0 68k 279k 62 RTR 15:58:02
insert query update delete getmore command vsize res faults netIn netOut conn repl time
341 341 168 173 0 1 32.3g 20g 0 66k 278k 62 RTR 15:58:03
334 333 168 166 0 13 32.3g 20g 0 63k 279k 62 RTR 15:58:04
209 210 105 104 0 1 32.3g 20g 0 39k 172k 62 RTR 15:58:05
287 282 142 145 0 1 32.3g 20g 1 53k 232k 62 RTR 15:58:06
146 149 75 71 0 13 32.3g 20g 0 29k 130k 62 RTR 15:58:07
272 269 133 139 0 1 32.3g 20g 0 51k 217k 62 RTR 15:58:08
412 416 208 204 0 13 32.3g 20g 0 88k 348k 62 RTR 15:58:09
426 425 214 212 0 1 32.3g 20g 0 85k 352k 62 RTR 15:58:10

PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20 0 32.6g 20g 1732 S 7.6 85.8 21:22.25 mongos

Comment by Damon Cortesi [ 10/Jul/12 ]

Just wanted to chime in on this bug: I've seen pretty severe memory leaks with mongos 1.8.4.

Attached is a graph of memory on the (dedicated) mongos instance over the past month. Each drop-off is when I have to manually restart mongos.

Comment by Azat Khuzhin [ 10/Jul/12 ]

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25335 mongodb 20 0 2762m 2.3g 4476 S 3 4.8 2:58.84 /usr/bin/mongos --config /etc/mongos.conf --pidfilepath /work/mongo/mongos.pid

I'll attach the pmap in a minute.

Comment by Scott Hernandez (Inactive) [ 10/Jul/12 ]

Okay, can you provide pmap data for the mongos process when it is using more than a few hundred MB of memory?

Comment by Azat Khuzhin [ 10/Jul/12 ]

With v2.1.2 the behavior is the same:

mongos> db.version()
2.1.2

Comment by Azat Khuzhin [ 10/Jul/12 ]

I understand that these builds have bugs; okay, I'll try it and post the results here.

Sorry, but I can't include this host in MMS.
Yes, that mongostat output was taken after a restart (vsize was just 802m).

$ mongostat
connected to: 127.0.0.1
insert  query update delete getmore command  vsize    res faults  netIn netOut  conn repl       time 
    89     89     89      0       0       1  18.5g  18.1g      0    40k    80k    87  RTR   04:45:10 
    87     87     87      0       0       1  18.5g  18.1g      0    39k    77k    87  RTR   04:45:11 
    72     72     72      0       0       1  18.5g  18.1g      0    32k    67k    87  RTR   04:45:12 
    76     76     76      0       0       1  18.5g  18.1g      0    34k    69k    87  RTR   04:45:13 
    60     60     60      0       0       1  18.5g  18.1g      0    27k    54k    87  RTR   04:45:14 
    55     55     55      0       0       1  18.5g  18.1g      0    25k    48k    87  RTR   04:45:15 
    88     88     88      0       0       1  18.5g  18.1g      0    40k    79k    87  RTR   04:45:16 

Comment by Scott Hernandez (Inactive) [ 09/Jul/12 ]

Can you run the 2.1.2 build instead of that nightly? There are some bugs in non-release dev builds, as well as in the dev "releases", since they are for testing/development.

Is this host monitored in MMS, and if so, what is the group name? If you are not monitoring with MMS, would you mind doing so to help debug this issue? http://mms.10gen.com/

Was that mongostat output taken after you restarted the mongos instance? Can you provide some updated mongostat numbers?

Comment by Azat Khuzhin [ 09/Jul/12 ]

On the second graph, the blue line shows free memory in %.

Comment by Azat Khuzhin [ 09/Jul/12 ]

I also run mapreduce, 4-6 jobs per day.

Comment by Azat Khuzhin [ 09/Jul/12 ]

I don't have any more machines.

I've attached a second graph.

I'm using a development version because I need some features from it. This is a production server that handles ~90 qps:

insert  query update delete getmore command  vsize    res faults  netIn netOut  conn repl       time 
    93     93     93      0       0       1   802m   409m      0    44k    92k    61  RTR   09:14:39 
    77     77     77      0       0       1   802m   410m      0    35k    69k    61  RTR   09:14:40 
   123    123    123      0       0       1   802m   410m      0    56k   113k    61  RTR   09:14:41 
   103    103    103      0       0       1   802m   411m      0    47k    95k    61  RTR   09:14:42 
    87     87     87      0       0       1   802m   411m      0    39k    80k    61  RTR   09:14:43 
   118    118    118      0       0       1   802m   411m      0    53k   109k    61  RTR   09:14:44

It's also interesting that insert = query = update.

Comment by Scott Hernandez (Inactive) [ 09/Jul/12 ]

Are you running anything else on this host?

When you restart the process, what does the memory usage look like initially?

What is this mongos instance used for? Is this a test environment (since you are using a development version of MongoDB), and if so, what kinds of tests are you running?

Comment by Azat Khuzhin [ 09/Jul/12 ]

Before this, I hadn't used mongos in production, and the servers where mongod runs don't use swap at all.

I've also attached graphs, where:
yellow is free swap space in % on the host running "mongos", "mongod", and "mongod --configsvr" (0% at the end);
red (which you can't see, because the dark blue line overlaps it) is free swap space in % on the host running "mongod", "mongod --configsvr", and "mongod --configsvr" (always ~100%).

Swap size: 30516 mb

Comment by Azat Khuzhin [ 09/Jul/12 ]

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND 
12220 mongodb   20   0 74.4g  43g 2440 S    3 93.0 147:53.49 /usr/bin/mongos --config /etc/mongos.conf --pidfilepath /work/mongo/mongos.pid

Resident memory = 43 GB (that's huge).
No mapreduce jobs are running now.
