[SERVER-29738] OOM Killer killed Mongodb Primary Created: 20/Jun/17  Updated: 29/Jul/17  Resolved: 27/Jun/17

Status: Closed
Project: Core Server
Component/s: Stability
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Abhishek Shukla Assignee: Mark Agarunov
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File diagnostic.data.tar.gz     File mongod.log    
Operating System: ALL
Participants:

 Description   

Jun 19 13:21:14 mongo01 kernel: vim invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
Jun 19 13:21:15 mongo01 kernel: vim cpuset=/ mems_allowed=0
Jun 19 13:21:15 mongo01 kernel: CPU: 0 PID: 538 Comm: vim Not tainted 3.10.0-327.36.3.el7.x86_64 #1
Jun 19 13:21:15 mongo01 kernel: Hardware name: Xen HVM domU, BIOS 4.2.amazon 11/11/2016
Jun 19 13:21:15 mongo01 kernel: ffff8801cdfb8b80 00000000d99d6248 ffff88005fee37b8 ffffffff81636431
Jun 19 13:21:15 mongo01 kernel: ffff88005fee3848 ffffffff816313cc ffff8800343f3130 ffff8800343f3148
Jun 19 13:21:15 mongo01 kernel: 0000000000000206 ffff8801cdfb8b80 ffff88005fee3830 ffffffff81128cef
Jun 19 13:21:15 mongo01 kernel: Call Trace:
Jun 19 13:21:15 mongo01 kernel: [<ffffffff81636431>] dump_stack+0x19/0x1b
Jun 19 13:21:15 mongo01 kernel: [<ffffffff816313cc>] dump_header+0x8e/0x214
Jun 19 13:21:15 mongo01 kernel: [<ffffffff81128cef>] ? delayacct_end+0x8f/0xb0
Jun 19 13:21:15 mongo01 kernel: [<ffffffff8116d21e>] oom_kill_process+0x24e/0x3b0
Jun 19 13:21:15 mongo01 kernel: [<ffffffff8116cd86>] ? find_lock_task_mm+0x56/0xc0
Jun 19 13:21:15 mongo01 kernel: [<ffffffff81088e4e>] ? has_capability_noaudit+0x1e/0x30
Jun 19 13:21:15 mongo01 kernel: [<ffffffff8116da46>] out_of_memory+0x4b6/0x4f0
Jun 19 13:21:15 mongo01 kernel: [<ffffffff81173c36>] __alloc_pages_nodemask+0xaa6/0xba0
Jun 19 13:21:15 mongo01 kernel: [<ffffffff811b7fca>] alloc_pages_vma+0x9a/0x150
Jun 19 13:21:15 mongo01 kernel: [<ffffffff81197b75>] handle_mm_fault+0xba5/0xf80
Jun 19 13:21:15 mongo01 kernel: [<ffffffff81642040>] __do_page_fault+0x150/0x450
Jun 19 13:21:15 mongo01 kernel: [<ffffffff810d82ec>] ? ktime_get_ts64+0x4c/0xf0
Jun 19 13:21:15 mongo01 kernel: [<ffffffff81642363>] do_page_fault+0x23/0x80
Jun 19 13:21:15 mongo01 kernel: [<ffffffff8163e648>] page_fault+0x28/0x30
Jun 19 13:21:15 mongo01 kernel: [<ffffffff81168977>] ? file_read_actor+0xd7/0x180
Jun 19 13:21:15 mongo01 kernel: [<ffffffff8116b0a8>] generic_file_aio_read+0x478/0x750
Jun 19 13:21:15 mongo01 kernel: [<ffffffffa012ced1>] xfs_file_aio_read+0x151/0x2f0 [xfs]
Jun 19 13:21:15 mongo01 kernel: [<ffffffff811de47d>] do_sync_read+0x8d/0xd0
Jun 19 13:21:15 mongo01 kernel: [<ffffffff811debdc>] vfs_read+0x9c/0x170
Jun 19 13:21:15 mongo01 kernel: [<ffffffff811df72f>] SyS_read+0x7f/0xe0
Jun 19 13:21:15 mongo01 kernel: [<ffffffff81646b49>] system_call_fastpath+0x16/0x1b
Jun 19 13:21:15 mongo01 kernel: Mem-Info:
Jun 19 13:21:15 mongo01 kernel: Node 0 DMA per-cpu:
Jun 19 13:21:15 mongo01 kernel: CPU    0: hi:    0, btch:   1 usd:   0
Jun 19 13:21:15 mongo01 kernel: CPU    1: hi:    0, btch:   1 usd:   0
Jun 19 13:21:15 mongo01 kernel: Node 0 DMA32 per-cpu:
Jun 19 13:21:15 mongo01 kernel: CPU    0: hi:  186, btch:  31 usd:  19
Jun 19 13:21:15 mongo01 kernel: CPU    1: hi:  186, btch:  31 usd: 169
Jun 19 13:21:15 mongo01 kernel: Node 0 Normal per-cpu:
Jun 19 13:21:15 mongo01 kernel: CPU    0: hi:  186, btch:  31 usd:  32
Jun 19 13:21:15 mongo01 kernel: CPU    1: hi:  186, btch:  31 usd: 161
Jun 19 13:21:15 mongo01 kernel: active_anon:1687527 inactive_anon:37500 isolated_anon:0
 active_file:1256 inactive_file:2152 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:31867 slab_reclaimable:19575 slab_unreclaimable:4568
 mapped:1559 shmem:90177 pagetables:5526 bounce:0
 free_cma:0
Jun 19 13:21:15 mongo01 kernel: Node 0 DMA free:15904kB min:140kB low:172kB high:208kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jun 19 13:21:15 mongo01 kernel: lowmem_reserve[]: 0 3583 7300 7300
Jun 19 13:21:15 mongo01 kernel: Node 0 DMA32 free:71252kB min:33104kB low:41380kB high:49656kB active_anon:3443088kB inactive_anon:78380kB active_file:2840kB inactive_file:2716kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3915776kB managed:3671400kB mlocked:0kB dirty:0kB writeback:0kB mapped:4664kB shmem:192548kB slab_reclaimable:46616kB slab_unreclaimable:5332kB kernel_stack:2832kB pagetables:9672kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jun 19 13:21:15 mongo01 kernel: lowmem_reserve[]: 0 0 3717 3717
Jun 19 13:21:15 mongo01 kernel: Node 0 Normal free:42912kB min:34336kB low:42920kB high:51504kB active_anon:3307020kB inactive_anon:71620kB active_file:2384kB inactive_file:9092kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3932160kB managed:3806392kB mlocked:0kB dirty:0kB writeback:0kB mapped:1572kB shmem:168160kB slab_reclaimable:31684kB slab_unreclaimable:12940kB kernel_stack:3344kB pagetables:12432kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:12157 all_unreclaimable? yes
Jun 19 13:21:15 mongo01 kernel: lowmem_reserve[]: 0 0 0 0
Jun 19 13:21:15 mongo01 kernel: Node 0 DMA: 0*4kB 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15904kB
Jun 19 13:21:15 mongo01 kernel: Node 0 DMA32: 256*4kB (UE) 169*8kB (UE) 690*16kB (UEM) 404*32kB (EM) 198*64kB (UEM) 82*128kB (UEM) 10*256kB (UEM) 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 56168kB
Jun 19 13:21:15 mongo01 kernel: Node 0 Normal: 280*4kB (UEM) 214*8kB (UE) 533*16kB (UEM) 321*32kB (UEM) 120*64kB (UEM) 25*128kB (UEM) 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 36608kB
Jun 19 13:21:15 mongo01 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jun 19 13:21:15 mongo01 kernel: 94731 total pagecache pages
Jun 19 13:21:15 mongo01 kernel: 0 pages in swap cache
Jun 19 13:21:15 mongo01 kernel: Swap cache stats: add 0, delete 0, find 0/0
Jun 19 13:21:15 mongo01 kernel: Free swap  = 0kB
Jun 19 13:21:15 mongo01 kernel: Total swap = 0kB
Jun 19 13:21:15 mongo01 kernel: 1965981 pages RAM
Jun 19 13:21:15 mongo01 kernel: 0 pages HighMem/MovableOnly
Jun 19 13:21:15 mongo01 kernel: 92557 pages reserved
Jun 19 13:21:15 mongo01 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Jun 19 13:21:15 mongo01 kernel: [  391]     0   391     9485      552      22        0             0 systemd-journal
Jun 19 13:21:15 mongo01 kernel: [  423]     0   423    10721      148      20        0         -1000 systemd-udevd
Jun 19 13:21:15 mongo01 kernel: [  438]     0   438    29185      113      27        0         -1000 auditd
Jun 19 13:21:15 mongo01 kernel: [  474]    81   474     6649       97      18        0          -900 dbus-daemon
Jun 19 13:21:15 mongo01 kernel: [  480]   996   480    28961      103      26        0             0 chronyd
Jun 19 13:21:15 mongo01 kernel: [  493]     0   493     4795       65      13        0             0 irqbalance
Jun 19 13:21:15 mongo01 kernel: [  496]     0   496     6600       83      15        0             0 systemd-logind
Jun 19 13:21:15 mongo01 kernel: [  499]     0   499   141561      761     126        0             0 rsyslogd
Jun 19 13:21:15 mongo01 kernel: [  506]     0   506    50842      123      40        0             0 gssproxy
Jun 19 13:21:15 mongo01 kernel: [  535]     0   535    31584      156      18        0             0 crond
Jun 19 13:21:15 mongo01 kernel: [  540]     0   540    27509       32      11        0             0 agetty
Jun 19 13:21:15 mongo01 kernel: [  544]     0   544    27509       33      10        0             0 agetty
Jun 19 13:21:15 mongo01 kernel: [  741]     0   741    27632     3121      51        0             0 dhclient
Jun 19 13:21:15 mongo01 kernel: [  787]     0   787   138267     2650      88        0             0 tuned
Jun 19 13:21:15 mongo01 kernel: [  788]     0   788    20640      219      43        0         -1000 sshd
Jun 19 13:21:15 mongo01 kernel: [ 1338]     0  1338    22785      262      43        0             0 master
Jun 19 13:21:15 mongo01 kernel: [ 1346]    89  1346    22828      252      45        0             0 qmgr
Jun 19 13:21:15 mongo01 kernel: [ 2221]   997  2221   131891      902      52        0             0 polkitd
Jun 19 13:21:15 mongo01 kernel: [29244]     0 29244    55763     1114      64        0             0 snmpd
Jun 19 13:21:15 mongo01 kernel: [27472]   995 27472  1623551   819093    2930        0             0 mongod
Jun 19 13:21:15 mongo01 kernel: [  328]    89   328    22811      251      44        0             0 pickup
Jun 19 13:21:15 mongo01 kernel: [  473]     0   473    35210      317      72        0             0 sshd
Jun 19 13:21:15 mongo01 kernel: [  476]  1000   476    35210      311      70        0             0 sshd
Jun 19 13:21:15 mongo01 kernel: [  477]  1000   477    28846      107      13        0             0 bash
Jun 19 13:21:15 mongo01 kernel: [  537]     0   537    48358      177      50        0             0 sudo
Jun 19 13:21:15 mongo01 kernel: [  538]     0   538   838533   800774    1592        0             0 vim
Jun 19 13:21:15 mongo01 kernel: Out of memory: Kill process 27472 (mongod) score 454 or sacrifice child
Jun 19 13:21:15 mongo01 kernel: Killed process 27472 (mongod) total-vm:6494204kB, anon-rss:3276400kB, file-rss:0kB



 Comments   
Comment by Mark Agarunov [ 27/Jun/17 ]

Hello shukla,

Thank you for providing this data. Looking over the output and data, it appears that the out-of-memory condition was not caused directly by mongod, but by a vim process using a large amount of memory:

Jun 19 13:21:15 mongo01 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Jun 19 13:21:15 mongo01 kernel: [27472]   995 27472  1623551   819093    2930        0             0 mongod
Jun 19 13:21:15 mongo01 kernel: [  538]     0   538   838533   800774    1592        0             0 vim

This appears to have been caused by the system simply running out of memory, and not by anything specific or related to mongod itself. I do not see anything to indicate a bug in the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this involving more discussion would be best posted on the mongodb-user group.
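
For anyone checking a similar report, the comparison above can be reproduced from the OOM killer's process table. The following is a minimal sketch, not part of the original analysis: the helper name `top_rss` and the regular expression are hypothetical, and it assumes the `rss` column is counted in 4 KiB pages (the usual x86_64 page size). Under that assumption mongod held roughly 3.1 GiB resident and vim roughly 3.05 GiB, on a host that reported 1965981 pages of RAM (about 7.5 GiB) and no swap.

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB pages, the usual x86_64 page size

# Matches process-table rows such as
# "[27472]   995 27472  1623551   819093    2930        0             0 mongod".
# The header row "[ pid ] uid ..." and call-trace lines do not match.
ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+(?P<name>\S+)"
)


def top_rss(log_lines, limit=5):
    """Return (name, pid, rss_mib) tuples sorted by resident memory, largest first."""
    rows = []
    for line in log_lines:
        m = ROW.search(line)
        if m:
            rss_mib = int(m["rss"]) * PAGE_KIB / 1024  # pages -> MiB
            rows.append((m["name"], int(m["pid"]), round(rss_mib, 1)))
    return sorted(rows, key=lambda r: r[2], reverse=True)[:limit]


# Three rows copied from the kernel log in this ticket.
sample = [
    "Jun 19 13:21:15 mongo01 kernel: [27472]   995 27472  1623551   819093    2930        0             0 mongod",
    "Jun 19 13:21:15 mongo01 kernel: [  538]     0   538   838533   800774    1592        0             0 vim",
    "Jun 19 13:21:15 mongo01 kernel: [  499]     0   499   141561      761     126        0             0 rsyslogd",
]

for name, pid, rss_mib in top_rss(sample):
    print(f"{name:10} pid={pid:<6} rss={rss_mib} MiB")
```

Feeding the full /var/log/messages excerpt to `top_rss` instead of `sample` lists the largest resident processes at the moment of the kill, which makes it easy to see whether mongod was the only large consumer.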

Thanks,
Mark

Comment by Abhishek Shukla [ 27/Jun/17 ]

Is there anything more I can diagnose to help you solve this problem?

Comment by Abhishek Shukla [ 21/Jun/17 ]

Thanks a lot for looking into it. I have attached the files to the ticket. Please let me know if you need more help with it.

Comment by Kelsey Schubert [ 20/Jun/17 ]

Hi shukla,

Thank you for the report. So we can continue to investigate, would you please upload an archive of the diagnostic.data directory and logs from the affected mongod?

Kind regards,
Thomas
