[SERVER-20159] Out of memory on index build during initial sync even with low cacheSize parameter Created: 27/Aug/15  Updated: 30/Dec/15  Resolved: 18/Sep/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.0.6
Fix Version/s: 3.0.7, 3.1.9

Type: Bug Priority: Critical - P2
Reporter: Anthony Pastor Assignee: Michael Cahill (Inactive)
Resolution: Done Votes: 0
Labels: WTmem
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian Wheezy (7.7)


Attachments: PNG File RAM_Graph_MongoDB.png     File WiredTigerStat.11.15.gz     File mongodb_crash.tar.gz     PNG File spike.png     HTML File stats.html    
Issue Links:
Depends
is depended on by WT-1973 MongoDB changes for WiredTiger 2.7.0 Closed
Duplicate
is duplicated by SERVER-19818 Cache usage exceeds configured maximu... Closed
Related
related to SERVER-17456 Mongodb 3.0 wiredTiger storage engine... Closed
related to SERVER-20204 Segmentation fault during index build... Closed
related to SERVER-18829 Cache usage exceeds configured maximu... Closed
is related to SERVER-19818 Cache usage exceeds configured maximu... Closed
Backwards Compatibility: Fully Compatible
Operating System: Linux
Backport Completed:
Participants:

 Description   

When we try to rebuild a 3.0.6 replica set member, the mongod process is killed by the OOM killer.

  • Reconstruction started at:
    2015/08/26-23h44: /etc/init.d/mongodb-rsm-uniq start
  • It stopped at 2015/08/27-07h21:

/var/log/syslog:

Aug 27 07:21:39 mongo-uniq1 kernel: mongod invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Aug 27 07:21:39 mongo-uniq1 kernel: mongod cpuset=/ mems_allowed=0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU: 7 PID: 4521 Comm: mongod Not tainted 3.10.23-xxxx-std-ipv6-64 #1
Aug 27 07:21:39 mongo-uniq1 kernel: Hardware name: Supermicro X9SRH-7F/7TF/X9SRH-7F/7TF, BIOS 3.00 07/05/2013
Aug 27 07:21:39 mongo-uniq1 kernel: 0000000000000000 ffff881fc3bb1918 ffffffff81d38724 ffff881fc3bb1998
Aug 27 07:21:39 mongo-uniq1 kernel: ffffffff81d328a8 ffff881fc3bb1998 ffff881fd02936e0 ffff881fd02936f8
Aug 27 07:21:39 mongo-uniq1 kernel: 0000000000000202 ffff881f00000000 0000000000000206 ffff881fc3bb1998
Aug 27 07:21:39 mongo-uniq1 kernel: Call Trace:
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff81d38724>] dump_stack+0x19/0x1b
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff81d328a8>] dump_header+0x75/0x1b9
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff81602326>] ? ___ratelimit+0x96/0x110
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff8112641f>] oom_kill_process+0x1bf/0x310
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff810c7422>] ? has_ns_capability_noaudit+0x12/0x20
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff81126b65>] out_of_memory+0x415/0x450
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff8112bad5>] __alloc_pages_nodemask+0x645/0x840
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff81159777>] alloc_pages_current+0xb7/0x180
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff81122f47>] __page_cache_alloc+0xa7/0xc0
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff81123b12>] filemap_fault+0x292/0x460
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff8113f90f>] __do_fault+0x6f/0x4c0
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff8114243b>] handle_pte_fault+0x8b/0x9f0
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff81182441>] ? set_fd_set+0x21/0x30
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff811831ed>] ? core_sys_select+0x23d/0x2f0
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff81143047>] handle_mm_fault+0x147/0x1e0
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff8106c391>] __do_page_fault+0x151/0x4b0
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff810e4f45>] ? check_preempt_curr+0x75/0xa0
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff810f18a7>] ? ktime_get_ts+0x47/0xe0
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff811823cc>] ? poll_select_copy_remaining+0xec/0x140
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff8106c719>] do_page_fault+0x9/0x10
Aug 27 07:21:39 mongo-uniq1 kernel: [<ffffffff81d41172>] page_fault+0x22/0x30
Aug 27 07:21:39 mongo-uniq1 kernel: Mem-Info:
Aug 27 07:21:39 mongo-uniq1 kernel: Node 0 DMA per-cpu:
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    0: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    1: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    2: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    3: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    4: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    5: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    6: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    7: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    8: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    9: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   10: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   11: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   12: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   13: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   14: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   15: hi:    0, btch:   1 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: Node 0 DMA32 per-cpu:
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    0: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    1: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    2: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    4: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    5: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    6: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    7: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    8: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    9: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   10: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   11: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   12: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   13: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   14: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   15: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: Node 0 Normal per-cpu:
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    0: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    1: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    2: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    4: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    5: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    6: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    7: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    8: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU    9: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   10: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   11: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   12: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   13: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   14: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: CPU   15: hi:  186, btch:  31 usd:   0
Aug 27 07:21:39 mongo-uniq1 kernel: active_anon:31550706 inactive_anon:1013370 isolated_anon:128
Aug 27 07:21:39 mongo-uniq1 kernel: active_file:134 inactive_file:0 isolated_file:0
Aug 27 07:21:39 mongo-uniq1 kernel: unevictable:0 dirty:0 writeback:7 unstable:0
Aug 27 07:21:39 mongo-uniq1 kernel: free:142436 slab_reclaimable:5981 slab_unreclaimable:15205
Aug 27 07:21:39 mongo-uniq1 kernel: mapped:254 shmem:0 pagetables:69074 bounce:0
Aug 27 07:21:39 mongo-uniq1 kernel: free_cma:0
Aug 27 07:21:39 mongo-uniq1 kernel: Node 0 DMA free:15892kB min:4kB low:4kB high:4kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Aug 27 07:21:39 mongo-uniq1 kernel: lowmem_reserve[]: 0 1978 128952 128952
Aug 27 07:21:39 mongo-uniq1 kernel: Node 0 DMA32 free:508600kB min:704kB low:880kB high:1056kB active_anon:1133036kB inactive_anon:377844kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2046684kB managed:2026228kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:1076kB slab_unreclaimable:3512kB kernel_stack:16kB pagetables:2944kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Aug 27 07:21:39 mongo-uniq1 kernel: lowmem_reserve[]: 0 0 126974 126974
Aug 27 07:21:39 mongo-uniq1 kernel: Node 0 Normal free:45252kB min:45252kB low:56564kB high:67876kB active_anon:125069788kB inactive_anon:3675636kB active_file:540kB inactive_file:0kB unevictable:0kB isolated(anon):512kB isolated(file):0kB present:132120576kB managed:130021460kB mlocked:0kB dirty:0kB writeback:28kB mapped:1020kB shmem:0kB slab_reclaimable:22848kB slab_unreclaimable:57308kB kernel_stack:2888kB pagetables:273352kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1881 all_unreclaimable? yes
Aug 27 07:21:39 mongo-uniq1 kernel: lowmem_reserve[]: 0 0 0 0
Aug 27 07:21:39 mongo-uniq1 kernel: Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (MR) = 15892kB
Aug 27 07:21:39 mongo-uniq1 kernel: Node 0 DMA32: 101*4kB (UEM) 42*8kB (UEM) 49*16kB (UE) 167*32kB (UE) 110*64kB (UEM) 69*128kB (UEM) 46*256kB (UE) 36*512kB (UE) 31*1024kB (UE) 23*2048kB (UE) 92*4096kB (UEMR) = 508628kB
Aug 27 07:21:39 mongo-uniq1 kernel: Node 0 Normal: 9505*4kB (UEM) 81*8kB (EM) 19*16kB (M) 25*32kB (MR) 9*64kB (MR) 8*128kB (R) 2*256kB (R) 1*512kB (R) 0*1024kB 0*2048kB 1*4096kB (R) = 46492kB
Aug 27 07:21:39 mongo-uniq1 kernel: 10662 total pagecache pages
Aug 27 07:21:39 mongo-uniq1 kernel: 10178 pages in swap cache
Aug 27 07:21:39 mongo-uniq1 kernel: Swap cache stats: add 21060325, delete 21050147, find 47193946/47244483
Aug 27 07:21:39 mongo-uniq1 kernel: Free swap  = 0kB
Aug 27 07:21:39 mongo-uniq1 kernel: Total swap = 9789436kB
Aug 27 07:21:39 mongo-uniq1 kernel: 33554431 pages RAM
Aug 27 07:21:39 mongo-uniq1 kernel: 538296 pages reserved
Aug 27 07:21:39 mongo-uniq1 kernel: 4190101 pages shared
Aug 27 07:21:39 mongo-uniq1 kernel: 32870580 pages non-shared
Aug 27 07:21:39 mongo-uniq1 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Aug 27 07:21:39 mongo-uniq1 kernel: [ 2700]     0  2700 5360        1      15      165         -1000 udevd
Aug 27 07:21:39 mongo-uniq1 kernel: [ 4455]     0  4455 50367      570      36      740             0 rsyslogd
Aug 27 07:21:39 mongo-uniq1 kernel: [ 4521]   106  4521 121313      559      90     7851             0 mongod
Aug 27 07:21:39 mongo-uniq1 kernel: [ 4598]     0  4598 3248        0      10       36             0 mdadm
Aug 27 07:21:39 mongo-uniq1 kernel: [ 4637]   103  4637 9769       29      23      132             0 ntpd
Aug 27 07:21:39 mongo-uniq1 kernel: [ 4646]     0  4646 5129       52      14       25             0 irqbalance
Aug 27 07:21:39 mongo-uniq1 kernel: [ 4660]   101  4660 78549      481      71     7365             0 named
Aug 27 07:21:39 mongo-uniq1 kernel: [ 4664]     0  4664 12484       27      26      124         -1000 sshd
Aug 27 07:21:39 mongo-uniq1 kernel: [ 4838]     0  4838 5224       22      14       40             0 cron
Aug 27 07:21:39 mongo-uniq1 kernel: [ 4925]     0  4925 1030        0       7       36             0 acpid
Aug 27 07:21:39 mongo-uniq1 kernel: [ 4974]     0  4974 37265      452      33     1045             0 fail2ban-server
Aug 27 07:21:39 mongo-uniq1 kernel: [ 4994]     0  4994 12283      307      27     1545             0 munin-node
Aug 27 07:21:39 mongo-uniq1 kernel: [ 5055]     0  5055 5987      104      16      412             0 gam_server
Aug 27 07:21:39 mongo-uniq1 kernel: [ 5308]     0  5308 4186        0      12       39             0 getty
Aug 27 07:21:39 mongo-uniq1 kernel: [ 5309]     0  5309 4186        0      13       38             0 getty
Aug 27 07:21:39 mongo-uniq1 kernel: [ 5310]     0  5310 4186        0      12       38             0 getty
Aug 27 07:21:39 mongo-uniq1 kernel: [ 5311]     0  5311 4186        0      12       38             0 getty
Aug 27 07:21:39 mongo-uniq1 kernel: [ 5312]     0  5312 4186        0      12       38             0 getty
Aug 27 07:21:39 mongo-uniq1 kernel: [ 5313]     0  5313 4186        0      12       38             0 getty
Aug 27 07:21:39 mongo-uniq1 kernel: [ 5315]     0  5315 5359        0      14      165         -1000 udevd
Aug 27 07:21:39 mongo-uniq1 kernel: [ 5316]     0  5316 5359        0      14      165         -1000 udevd
Aug 27 07:21:39 mongo-uniq1 kernel: [16237]   106 16237    58006 13576      85    10814             0 mongos
Aug 27 07:21:39 mongo-uniq1 kernel: [19001]   106 19001   221422 1667     144     6120             0 mongod
Aug 27 07:21:39 mongo-uniq1 kernel: [ 2001]   107  2001 6332        8      16      122             0 nrpe
Aug 27 07:21:39 mongo-uniq1 kernel: [11628]     0 11628 17816       28      38      179             0 sshd
Aug 27 07:21:39 mongo-uniq1 kernel: [11630]     0 11630 5103        1      13      255             0 bash
Aug 27 07:21:39 mongo-uniq1 kernel: [ 2912]     0  2912 1530       11       7       16             0 tail
Aug 27 07:21:39 mongo-uniq1 kernel: [22462]   106 22462 35077302 32535764   68347  2410259             0 mongod
Aug 27 07:21:39 mongo-uniq1 kernel: Out of memory: Kill process 22462 (mongod) score 987 or sacrifice child
Aug 27 07:21:39 mongo-uniq1 kernel: Killed process 22462 (mongod) total-vm:140309208kB, anon-rss:130143056kB, file-rss:0kB 

/var/log/mongodb/mongodb.log:

[...]
2015-08-27T07:21:01.010+0200 I -        [rsSync]   Index Build: 3580500/15695121 22%
2015-08-27T07:21:01.463+0200 I NETWORK  [conn2376] end connection 172.16.21.2:55126 (4 connections now open)
2015-08-27T07:21:01.472+0200 I NETWORK  [initandlisten] connection accepted from 172.16.21.2:55128 #2378 (5 connections now open)
2015-08-27T07:21:04.007+0200 I -        [rsSync]   Index Build: 3613100/15695121 23%
2015-08-27T07:21:07.006+0200 I -        [rsSync]   Index Build: 3647100/15695121 23%
2015-08-27T07:21:10.007+0200 I -        [rsSync]   Index Build: 3680300/15695121 23%
2015-08-27T07:21:13.015+0200 I -        [rsSync]   Index Build: 3710300/15695121 23%
2015-08-27T07:21:16.015+0200 I -        [rsSync]   Index Build: 3736700/15695121 23%
2015-08-27T07:21:19.007+0200 I -        [rsSync]   Index Build: 3764300/15695121 23%
2015-08-27T07:21:19.214+0200 I NETWORK  [initandlisten] connection accepted from 172.16.21.3:53807 #2379 (6 connections now open)
2015-08-27T07:21:19.218+0200 I NETWORK  [conn2377] end connection 172.16.21.3:53803 (5 connections now open)
2015-08-27T07:21:22.021+0200 I -        [rsSync]   Index Build: 3790300/15695121 24%
2015-08-27T07:21:25.035+0200 I -        [rsSync]   Index Build: 3805100/15695121 24%
2015-08-27T07:21:25.418+0200 I COMMAND  [conn2379] command admin.$cmd command: replSetHeartbeat { replSetHeartbeat: "uniq", pv: 1, v: 7, from: "172.16.21.3:27020", fromId: 2, checkEmpty: false } ntoreturn:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:149 locks:{} 184ms
2015-08-27T07:21:28.012+0200 I -        [rsSync]   Index Build: 3827600/15695121 24%
2015-08-27T07:21:31.011+0200 I -        [rsSync]   Index Build: 3850500/15695121 24%
2015-08-27T07:21:31.877+0200 I NETWORK  [initandlisten] connection accepted from 172.16.21.2:55130 #2380 (6 connections now open)
2015-08-27T07:21:31.884+0200 I NETWORK  [conn2378] end connection 172.16.21.2:55128 (5 connections now open)
2015-08-27T07:21:34.007+0200 I -        [rsSync]   Index Build: 3873400/15695121 24%
2015-08-27T07:21:37.016+0200 I -        [rsSync]   Index Build: 3897400/15695121 24%


mongo-wt.conf we use:
 
$ cat /etc/mongod-wt.conf
storage:
   dbPath: /var/lib/mongodb
   engine: wiredTiger
   wiredTiger.engineConfig.cacheSizeGB: 1
 
systemLog:
   destination: file
   path: /var/log/mongodb/mongodb.log
   logAppend: true
 
 
net:
   bindIp: 172.16.21.1
   http:
      enabled: false 
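
For reference, a sketch of the same storage settings with the wiredTiger options written in nested YAML form (this only restates the file above; it is not a change to what was deployed):

storage:
   dbPath: /var/lib/mongodb
   engine: wiredTiger
   wiredTiger:
      engineConfig:
         cacheSizeGB: 1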

We use:

  • Debian Wheezy 7.7.
  • MongoDB 3.0.6

I've attached a RAM graph to this ticket.
You can see mongod using 1 GB at first, and then saturating the available memory (128 GB).



 Comments   
Comment by Anthony Pastor [ 21/Oct/15 ]

OK, perfect: we successfully completed the migration to WiredTiger using a higher ulimit value and --wiredTigerEngineConfigString=file_manager=(close_idle_time=300).
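
For reference, a sketch of the kind of invocation this amounts to (the limit value, config path and close_idle_time value are the ones quoted elsewhere in this ticket; the actual init-script wiring is not shown):

# raise the open-files limit in the shell/environment that starts mongod
ulimit -n 900000
mongod --config /etc/mongod-wt.conf \
       --wiredTigerEngineConfigString="file_manager=(close_idle_time=300)"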

Thanks a lot again for your feedback.

Comment by Ramon Fernandez Marina [ 16/Oct/15 ]

Hi anthony.pastor. WiredTiger keeps idle file handles open for about 24h to avoid delays when accessing collections and indexes. While this is a good choice for most users, deployments like yours with such a large number of collections may run into OS limits under certain circumstances.

One thing you can do is lower the threshold for closing idle file handles with the following command-line argument:

--wiredTigerEngineConfigString=file_manager=(close_idle_time=NNN)

where NNN is the number of seconds a file handle must be idle before WiredTiger closes it. You can try a low value for NNN and evaluate how it affects your operations based on your data access patterns. Note, however, that WiredTiger needs at least two file handles per collection: one for the collection itself and one for each index (so at least one for the _id index). If your schema contains a very large number of collections and/or indexes, you'll need to adjust your OS limits accordingly.

To answer your question about my settings, I had a 100K limit on file handles and the initial sync completed successfully, but I only had the uniques database.

Comment by Anthony Pastor [ 16/Oct/15 ]

Hi,

For the past 3 days we have been trying to rebuild our cluster with 3.0.7, with no luck.
We keep hitting the following error: too many open files.

2015-10-15T02:36:34.281+0200 D REPL [ReplicationExecutor] Scheduling heartbeat to 172.16.21.3:27020 at 2015-10-15T00:36:36.281Z
2015-10-15T02:36:34.329+0200 E STORAGE [rsSync] WiredTiger (24) [1444869394:329708][14846:0x7f60cfb43700], session.create: /var/lib/mongodb-rsm/index-1279384-7296191246901652904.wt: Too many open files
2015-10-15T02:36:34.330+0200 D STORAGE [rsSync] WT rollback_transaction
2015-10-15T02:36:34.330+0200 D - [rsSync] User Assertion: 8:24: Too many open files
2015-10-15T02:36:34.425+0200 E REPL [rsSync] 8 24: Too many open files
2015-10-15T02:36:34.425+0200 E REPL [rsSync] initial sync attempt failed, 0 attempts remaining

We're currently retrying the rebuild with a higher open-files limit: 900,000.

The number of open file descriptors never decreases during the rebuild.
Right now we have almost 500K open file descriptors, and we don't know yet whether the rebuild will succeed.

$ ls /proc/19802/fd/ | wc -l
481328
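
For reference, a sketch of how we keep an eye on the configured limit versus actual usage (the pgrep-based PID lookup is illustrative):

pid=$(pgrep -o mongod)
grep 'Max open files' /proc/$pid/limits      # configured soft/hard limit for the running process
watch -n 60 "ls /proc/$pid/fd | wc -l"       # sample the open descriptor count every minute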

Is it normal that the file descriptor count never decreases?
When you successfully rebuilt our dataset, did you run into this issue? Could you tell us which ulimit setting you used?

Thanks
Regards.

Comment by Anthony Pastor [ 18/Sep/15 ]

Thanks a lot, that's great!

Comment by Ramon Fernandez Marina [ 18/Sep/15 ]

anthony.pastor, this is to let you know that a fix for the bug you reported has made it into the MongoDB tree, so we've resolved this issue.

The fix will be in the upcoming 3.0.7 stable release, scheduled for late September / early October. If you can't wait you can download a nightly build for linux-64, with the strong recommendation that you upgrade to 3.0.7 when it becomes available.

Thanks again,
Ramón.

Comment by Michael Cahill (Inactive) [ 18/Sep/15 ]

Resolved by latest WiredTiger drop.

Comment by Githook User [ 17/Sep/15 ]

Author: Alex Gorrod <alexander.gorrod@mongodb.com> (agorrod)

Message: SERVER-20159 Make all readers wait while the cache is full.

Merge pull request #2197 from wiredtiger/SERVER-20159

(cherry picked from commit 3b723612cd3273662c0fd2d74603047e0ba99460)
Branch: mongodb-3.0
https://github.com/wiredtiger/wiredtiger/commit/07dfbfa48cd19b462bb7de08c394ec7e3842dbf9

Comment by Githook User [ 16/Sep/15 ]

Author: Alex Gorrod <alexander.gorrod@mongodb.com> (agorrod)

Message: Merge pull request #2197 from wiredtiger/SERVER-20159

SERVER-20159 Make all readers wait while the cache is full.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/3b723612cd3273662c0fd2d74603047e0ba99460

Comment by Githook User [ 16/Sep/15 ]

Author: Michael Cahill <michael.cahill@mongodb.com> (michaelcahill)

Message: SERVER-20159 Make all readers wait while the cache is full.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/e548f7fd29d38481d239dd7aa26a1e2e352178b6

Comment by Anthony Pastor [ 14/Sep/15 ]

Thank you very much for your efforts !

Comment by Ramon Fernandez Marina [ 14/Sep/15 ]

anthony.pastor, we believe we have identified the root cause of this bug, and are actively investigating a fix.

Thanks,
Ramón.

Comment by Michael Cahill (Inactive) [ 14/Sep/15 ]

I think the problem here is this code in __wt_cache_eviction_worker (around line 1532):

                /*
                 * The cache is still full and no pages were found in the queue
                 * to evict.  If this transaction is the one holding back the
                 * oldest ID, we can't wait forever.  We'll block next time we
                 * are not busy.
                 */
                if (busy) {
                        __wt_txn_update_oldest(session, 0);
                        if (txn_state->id == txn_global->oldest_id ||
                            txn_state->snap_min == txn_global->oldest_id)
                                return (0);
                }

In this workload we will always have busy==1 during the collection walk, and I think the transaction will also quickly end up holding the oldest transaction ID in the system because the index build is single-threaded.

This code exists to avoid deadlocks in workloads with tiny caches: I'll try to remove it and see whether we can avoid those hangs in some less problematic way.

Comment by Keith Bostic (Inactive) [ 11/Sep/15 ]

Here are the interesting threads after the WT cache has started growing. From the WT stats: Sep 11 16:47:09 34912239018 /home/rf/tmp/SERVER-20159/wt/db cache: bytes currently in the cache

Thread 8 (Thread 0x7f12e43e1700 (LWP 26204)):
#0  0x00007f12e7fa3fdd in fdatasync () at ../sysdeps/unix/syscall-template.S:81
#1  0x000000000136646c in __wt_fsync ()
#2  0x000000000139d957 in ?? ()
#3  0x000000000139f733 in __wt_txn_checkpoint ()
#4  0x00000000013925a6 in ?? ()
#5  0x000000000132abba in ?? ()
#6  0x00007f12e8eab182 in start_thread (arg=0x7f12e43e1700) at pthread_create.c:312
#7  0x00007f12e7fac47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
 
Thread 7 (Thread 0x7f12e3be0700 (LWP 26205)):
#0  0x00007f12e8eb2b9d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x0000000000f68333 in mongo::sleepmicros(long long) ()
#2  0x0000000000d7bb62 in ?? ()
#3  0x0000000000eedda0 in mongo::BackgroundJob::jobBody() ()
#4  0x0000000000fac5f4 in ?? ()
#5  0x00007f12e8eab182 in start_thread (arg=0x7f12e3be0700) at pthread_create.c:312
#6  0x00007f12e7fac47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
 
Thread 3 (Thread 0x7f12e1bdc700 (LWP 27046)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000000000ed63e4 in mongo::DeadlineMonitor<mongo::V8Scope>::deadlineMonitorThread() ()
#2  0x0000000000fac5f4 in ?? ()
#3  0x00007f12e8eab182 in start_thread (arg=0x7f12e1bdc700) at pthread_create.c:312
#4  0x00007f12e7fac47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Thread 2 (Thread 0x7f12e13db700 (LWP 27075)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000beab4f in mongo::RangeDeleter::doWork() ()
#2  0x0000000000fac5f4 in ?? ()
#3  0x00007f12e8eab182 in start_thread (arg=0x7f12e13db700) at pthread_create.c:312
#4  0x00007f12e7fac47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
 
Thread 1 (Thread 0x7f12e92bbbc0 (LWP 26190)):
#0  0x00007f12e8eb2cc3 in pread64 () at ../sysdeps/unix/syscall-template.S:81
#1  0x0000000001367d3b in __wt_read ()
#2  0x00000000012e7fde in __wt_block_read_off ()
#3  0x00000000012e82e2 in __wt_bm_read ()
#4  0x000000000130202e in __wt_bt_read ()
#5  0x0000000001306cd9 in __wt_cache_read ()
#6  0x0000000001303d33 in __wt_page_in_func ()
#7  0x000000000131800f in ?? ()
#8  0x0000000001318d32 in __wt_tree_walk ()
#9  0x00000000012ec7c6 in __wt_btcur_next ()
#10 0x0000000001336a89 in ?? ()
#11 0x0000000000d769d3 in mongo::WiredTigerRecordStore::Iterator::_getNext() ()
#12 0x0000000000d76a82 in mongo::WiredTigerRecordStore::Iterator::getNext() ()
#13 0x00000000009e45b2 in mongo::CollectionScan::work(unsigned long*) ()
#14 0x0000000000bc5834 in mongo::PlanExecutor::getNextSnapshotted(mongo::Snapshotted<mongo::BSONObj>*, mongo::RecordId*) ()
#15 0x000000000090fd1b in mongo::MultiIndexBlock::insertAllDocumentsInCollection(std::set<mongo::RecordId, std::less<mongo::RecordId>, std::allocator<mongo::RecordId> >*) ()
#16 0x0000000000a98f5d in mongo::restartInProgressIndexesFromLastShutdown(mongo::OperationContext*) ()
#17 0x00000000007e4dff in mongo::initAndListen(int) ()
#18 0x00000000007e88c9 in main ()

Comment by Ramon Fernandez Marina [ 11/Sep/15 ]

anthony.pastor, after several attempts I've been able to reproduce the symptoms described in this ticket: during the _id index build in uniques.hll_l13_r6 memory consumption climbs enough for mongod to get killed by the OOM killer.

I'm now going over the server data I collected in the process. We're keeping this ticket in "debugging with submitter" state until we find out the root cause of the problem.

Thanks again for all your help so far.

Regards,
Ramón.

Comment by Ramon Fernandez Marina [ 09/Sep/15 ]

anthony.pastor, this is just to let you know that I was able to start up a node with the data files you sent us. I set up a secondary running WiredTiger and it's now going through initial sync. Given the sheer amount of data this will take a bit, but we're making progress.

Regards,
Ramón.

Comment by Ramon Fernandez Marina [ 08/Sep/15 ]

Thanks anthony.pastor, I didn't see part.17 and part.18 last week when I downloaded the rest of the files. Perhaps they were affected by the same problem as the previously missing files?

In any event, I see them now and have downloaded them to my computer. I'll keep you posted on my reproduction efforts.

Comment by Anthony Pastor [ 08/Sep/15 ]

Upload completed.

Comment by Anthony Pastor [ 08/Sep/15 ]

The MD5 hashes are the same.
You don't mention part.17 and part.18, which are also part of the set.

I'm re-uploading them in case you haven't received those.

Comment by Ramon Fernandez Marina [ 08/Sep/15 ]

anthony.pastor, I received an error from tar when extracting the file. Given the sheer size of the data, it's taking me a while to find out which part.NN file is causing the error. Here are the MD5 checksums of the files I have:

5b852e31a5e2f841605e12bfc3b8000a  part.00
57c2e486b14f2a33f6243b23a113f77b  part.01
aece1ec729454e9975cec3b007b6cabd  part.02
4cc87851147a8f81f5fe2e1d063b20c7  part.03
8359ce2f3edeebb120b60713d701be10  part.04
c340ed178295b30106f6f16d16e33fa9  part.05
f4ea044bb7861811bd0de07bb7f8e5cb  part.06
269aa5f65fb74b16e96c4cde00e0c6d8  part.07
613d93ea9d5e976ff59ebb02c619b6e5  part.08
c58b536e28d83a830d7eb697dd61c652  part.09
8704a502f296910c79218876028d2db2  part.10
0b90db02dea007f855b2951e80473dad  part.11
78767bdbe380276162d84aaca552003a  part.12
7edf843cbe7a8f4032b9b251e046d760  part.13
9b9843ac2791aeecec51dd6a87aa2412  part.14
7b642e3c379f57f0669e6a00704a60e6  part.15
f769c611625bc1c95cb509f4b7dcce02  part.16

If you still have the part.NN files and can run md5sum on them and verify the checksum against the list above we should be able to tell which file(s) got corrupted in flight.
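
The list above is already in the format md5sum -c expects, so one way to verify (a sketch, assuming the list is saved as expected.md5 in the directory holding the part.NN files):

md5sum -c expected.md5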

We're actively working to get to the bottom of this issue, and your help and patience are much appreciated.

Regards,
Ramón.

Comment by Anthony Pastor [ 07/Sep/15 ]

Were you able to reproduce the problem with the data files?

Comment by Ramon Fernandez Marina [ 02/Sep/15 ]

Thanks for the additional information and for uploading the missing files anthony.pastor, much appreciated. We believe the behavior you're seeing could be explained by SERVER-20248, and we hope the data you just uploaded helps us determine whether that's the case or not and find a root cause for this problem.

Comment by Anthony Pastor [ 02/Sep/15 ]

I'm uploading the missing parts right now.

The reconstruction failed again during an index rebuild:

2015-09-01T18:53:53.617+0200 I INDEX    [rsSync] build index on: uniques.hll_l13_r6 properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "uniques.hll_l13_r6" }

Comment by Ramon Fernandez Marina [ 01/Sep/15 ]

anthony.pastor, the files I listed above are missing from the upload portal. I was convinced I saw part.01 and part.01 before, but we couldn't find any problems with the S3 bucket used for this purpose and we couldn't find a scenario that would make that possible (e.g.: partial update).

Could you please re-upload them?

Thanks,
Ramón.

Comment by Ramon Fernandez Marina [ 01/Sep/15 ]

Thanks for the heads-up anthony.pastor – and for uploading the files! Downloading them now, will post updates to this ticket as they become available.

EDIT:
Hmmm, for some reason the following files are not listed:

  • part.01
  • part.05
  • part.08
  • part.09
  • part.10
  • part.13

Will check to see if there was an error on our end and let you know if you need to re-upload these. Apologies for the inconvenience.

Comment by Anthony Pastor [ 01/Sep/15 ]

Uploads are completed.

Comment by Bruce Lucas (Inactive) [ 01/Sep/15 ]

Thanks Anthony. We can indeed see part.00 through part.06, about 5 GB each. Let us know when the upload has completed.
-Bruce

Comment by Anthony Pastor [ 01/Sep/15 ]

I restarted the secondary and it rebuilt the indexes.
This time it looks like the hll_l13_r6 collection was rebuilt successfully.

RAM usage is still very high while the rebuild is in progress (less than 3 GB free on a 128 GB server).

About the data files:

The compressed file is 93 GB.
I'm uploading 18 part.* files through your platform.
Could you confirm that you are correctly receiving the first parts, please?

Comment by Ramon Fernandez Marina [ 29/Aug/15 ]

The captured data shows a spike in the bytes held in the cache at the very end, right before the node is killed by the OS (see the attached spike.png).

I also found the following in the logs:

2015-08-27T16:35:59.674+0200 I CONTROL  [initandlisten] ** WARNING: soft rlimits too low. rlimits set to 32000 processes, 100000 files. Number of processes should be at least 50000 : 0.5 times number of files.
[...]
2015-08-27T21:46:21.480+0200 I INDEX    [rsSync] build index done.  scanned 1 total records. 0 secs
2015-08-27T21:46:21.993+0200 E STORAGE  [rsSync] WiredTiger (24) [1440704781:993666][7163:0x7feaa57bb700], session.create: /var/lib/mongodb-rsm/index-199942--8063021761882435496.wt: Too many open files
2015-08-27T21:46:22.020+0200 E REPL     [rsSync] 8 24: Too many open files
2015-08-27T21:46:22.020+0200 E REPL     [rsSync] initial sync attempt failed, 9 attempts remaining
2015-08-27T21:46:27.020+0200 I REPL     [rsSync] initial sync pending
2015-08-27T21:46:27.021+0200 I REPL     [ReplicationExecutor] syncing from: 172.16.21.2:27017
2015-08-27T21:46:27.052+0200 I REPL     [rsSync] initial sync drop all databases
2015-08-27T21:46:27.052+0200 I STORAGE  [rsSync] dropAllDatabasesExceptLocal 3

Not sure if this is relevant yet.

Comment by Ramon Fernandez Marina [ 28/Aug/15 ]

anthony.pastor, here are the next steps we can try to investigate this problem:

  1. Restart the secondary node. The node should re-try to rebuild the index; if the node gets killed again then we know for sure your specific dataset can trigger this problem. Assuming the problem reproduces:
  2. Calculate the size of /var/lib/mongodb on the secondary:

    du -csk /var/lib/mongodb
    

  3. If the size is below 5GB, tar the whole directory:

    tar czf dbpath.tgz /var/lib/mongodb
    

  4. Upload dbpath.tgz: point your browser to this upload portal, and either drag-and-drop dbpath.tgz to this page, or click on the left box to select dbpath.tgz for upload, or if you prefer using the command line you can use the curl command shown in the right box

The contents of /var/lib/mongodb from the secondary should be enough to recreate this problem on our end. If in step 3 above the size of the data is larger than 5GB, then you can run:

split -d -b 5300000000 dbpath.tgz part.

This will produce a series of part.XX where XX is a number; you can then upload those files via the same portal described above.
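
If it helps, a sketch that also records a checksum per part at split time, so any corruption in flight can be spotted later (same file names as above):

split -d -b 5300000000 dbpath.tgz part.
md5sum part.* > parts.md5     # keep this list to compare against what we receive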

Comment by Anthony Pastor [ 28/Aug/15 ]

On the secondary node I ran:

rs.slaveOk()
use uniques
db.hll_l13_r6.stats().wiredTiger.uri

Here's the output:
statistics:table:collection-787--3929346498517275754

On the primary we have almost 1.3 TB of uniques.* files.
Could we copy them onto an FTP server and send you a private message, for instance?

Comment by Ramon Fernandez Marina [ 28/Aug/15 ]

Apologies if I wasn't clear anthony.pastor – yes, the above instructions are for the secondary. On the primary there's a file per database, so you'd need to upload /var/lib/mongodb/uniques.*.

The reason I sent instructions for the secondary is because WiredTiger uses a file per collection, and since index building happens after the collection has been replicated all the collection data should be present in the secondary – but I also realize that in the primary the data files will be 2GB in size at most, so it will be easier to upload /var/lib/mongodb/uniques.*.

bruce.lucas@10gen.com, do we need the data from the primary, the secondary, or it doesn't matter?

Thanks,
Ramón.

Comment by Anthony Pastor [ 28/Aug/15 ]

The primary isn't using WiredTiger yet.
I can't use: db.hll_l13_r6.stats().wiredTiger.uri

Comment by Ramon Fernandez Marina [ 28/Aug/15 ]

Having a 2.6.10 primary and a 3.0.x secondary should work, as that's the upgrade path for replica sets, so there's nothing wrong with that. To find the exact on-disk name for the collection you can run this from the mongo shell:

rs.slaveOk()
use uniques
db.hll_l13_r6.stats().wiredTiger.uri

You should find the corresponding file in /var/lib/mongodb. Please note that the upload portal limits files to 5GB in size; if this file is larger it will need to be split – let me know if you need a recipe for doing that.
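
For illustration, a sketch of how the returned URI maps to a data file on disk with the default storage options (the ident shown is a placeholder):

# wiredTiger.uri looks like "statistics:table:collection-NNN--XXXXXXXX";
# the matching data file under the dbPath is then:
ls -lh /var/lib/mongodb/collection-NNN--XXXXXXXX.wt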

Thanks,
Ramón.

Comment by Anthony Pastor [ 28/Aug/15 ]

Thank you for your answer.

I realize I didn't mention that the surviving primary is still running 2.6.10, without WiredTiger yet.

The issue is happening during the switch to WiredTiger.

How can I get the exact file name on disk for this collection?

Comment by Ramon Fernandez Marina [ 28/Aug/15 ]

Thanks for collecting this information anthony.pastor, we're taking a look now. I tried to reproduce this behavior yesterday, unsuccessfully.

A colleague points out that this may be very difficult to reproduce and that we may need to see the database files directly to understand what's triggering this condition. If you would consider sharing the uniques.hll_l13_r6 collection with us you may upload it privately and securely via this portal. You can find the exact file name on disk for this collection by running the following from the shell:

use uniques
db.hll_l13_r6.stats().wiredTiger.uri

Thanks,
Ramón.

Comment by Anthony Pastor [ 28/Aug/15 ]

Hi,

I restarted the replication yesterday to capture the metrics you asked for.
I've attached the file mongodb_crash.tar.gz to my first message; it contains all the metrics you asked for (with a 10s interval for iostat and serverStatus).

mongod was once again killed by the OOM killer during the index build on the same collection:

uniq:PRIMARY> db.hll_l13_r6.stats()
{
	"ns" : "uniques.hll_l13_r6",
	"count" : 15695121,
	"size" : 957936455216,
	"avgObjSize" : 61034,
	"storageSize" : 960173003840,
	"numExtents" : 467,
	"nindexes" : 2,
	"lastExtentSize" : 2146426864,
	"paddingFactor" : 1,
	"systemFlags" : 1,
	"userFlags" : 1,
	"totalIndexSize" : 1487631376,
	"indexSizes" : {
		"_id_" : 980653968,
		"_id.i_1" : 506977408
	},
	"ok" : 1
}

Comment by Anthony Pastor [ 28/Aug/15 ]
  • MongoDB RAM Graph.
  • MongoDB Log.
  • iostat.
  • ServerStatus.
Comment by Ramon Fernandez Marina [ 27/Aug/15 ]

Thanks for your report anthony.pastor, and sorry you're experiencing this problem. I'd say this is definitely unexpected, so we'll need some more information to understand what's happening. I'd like to ask for two things.

  1. The output of db.collection.stats() for the collection where this index build is happening. Basically I'd like to see the document size to try to reproduce this exact situation on my end. Any additional details about the size of the documents and about this index may be useful as well.

  2. Could you re-run the initial sync while collecting server statistics on the secondary node? You'd need to open a shell and run the following:

    # Delay in seconds
    delay=1
    mongo --eval "while(true) {print(JSON.stringify(db.serverStatus({tcmalloc:1}))); sleep($delay*1000)}" >ss.log &
    iostat -k -t -x $delay >iostat.log &
    

    You can adjust the delay depending on how long this issue takes to trigger; if it's, say, 24h, the delay can be 10s to prevent the resulting files from being too large. If you could then upload these files along with the mongod.log for the whole process that should give us sufficient information to understand the source of the problem.
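
    Once the issue triggers (or you stop the run), a sketch for stopping the collectors and bundling everything up, assuming they were started as background jobs in the same shell and that mongod logs to the path from your config file:

    kill %1 %2                                   # stop the serverStatus and iostat loops
    tar czf diagnostics.tgz ss.log iostat.log /var/log/mongodb/mongodb.log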

Thanks,
Ramón.
