-
Type: Bug
-
Resolution: Done
-
Priority: Major - P3
-
None
-
Affects Version/s: 3.0.5
-
Component/s: WiredTiger
-
Labels: None
-
ALL
-
We upgraded our replicated, sharded MongoDB setup to the latest release, 3.0.5, in the hope of fixing the OOM issues we have had since migrating our storage engine from MMAP to WT, but the memory usage issues did not go away. Memory usage increases over time, and only a restart releases the allocated memory.
We are running 4 shard servers on one Ubuntu Server instance (the primary instance in our 3-member replica set) with 60 GB of system memory and WT as the storage engine. We set the cache size to 13 GB for each shard server, leaving 8 GB of memory for system processes and for anything Mongo needs beyond the cache (open cursors, open sessions, etc.). However, the processes use far more than that, and the system kills them.
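For reference, the memory budget described above works out as in the sketch below. The function and field names are hypothetical; only the three input numbers (60 GB total, 4 shard servers, 13 GB cache each) come from this report. Note that the WiredTiger cache size setting bounds the cache only, not the total memory of a mongod process, so the 2 GB per-process headroom computed here is an upper bound on what each process can use outside its cache before the machine runs short.

```javascript
// Memory budget from the report: 60 GB host, 4 shard servers, 13 GB cache each.
// memoryBudget and its field names are illustrative, not part of any MongoDB API.
function memoryBudget(totalGB, shardCount, cachePerShardGB) {
  var cacheTotalGB = shardCount * cachePerShardGB;
  var headroomGB = totalGB - cacheTotalGB;
  return {
    cacheTotalGB: cacheTotalGB,                  // memory reserved for WT caches
    headroomGB: headroomGB,                      // left for the OS and non-cache use
    headroomPerShardGB: headroomGB / shardCount  // if the OS itself needed nothing
  };
}

var budget = memoryBudget(60, 4, 13);
console.log(budget);
// cacheTotalGB: 52, headroomGB: 8, headroomPerShardGB: 2
```

With this layout, any mongod whose non-cache allocations grow past roughly 2 GB pushes the host toward the OOM killer, even if every cache stays within its 13 GB limit.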
Two of the four shard servers running on our primary instance were killed by the system with an OOM error.
Please find attached the db.serverStatus({tcmalloc:true}) output for all four shard servers running on the primary, captured starting 1 hour before the failure occurred. Also attached is the syslog, which recorded the system killing the two shard servers.
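One way to read the attached serverStatus output is to compare what the allocator reports with what the WiredTiger cache holds. The sketch below assumes the output has been saved as plain JSON and that it contains the fields `wiredTiger.cache["bytes currently in the cache"]`, `wiredTiger.cache["maximum bytes configured"]`, `tcmalloc.generic.current_allocated_bytes` and `tcmalloc.generic.heap_size`; the field names should be checked against the actual attachments. `summarizeMemory` is a hypothetical helper, and the sample document uses made-up values, not data from this ticket.

```javascript
// Summarize allocator vs. cache usage from a serverStatus-like document.
// Assumes numeric fields are plain JavaScript numbers (e.g. parsed from JSON).
function summarizeMemory(status) {
  var GiB = Math.pow(1024, 3);
  var cache = status.wiredTiger.cache;
  var generic = status.tcmalloc.generic;
  var cacheUsed = Number(cache["bytes currently in the cache"]);
  var allocated = Number(generic.current_allocated_bytes);
  return {
    cacheConfiguredGiB: Number(cache["maximum bytes configured"]) / GiB,
    cacheUsedGiB: cacheUsed / GiB,
    allocatedGiB: allocated / GiB,             // bytes handed out by tcmalloc
    heapGiB: Number(generic.heap_size) / GiB,  // bytes tcmalloc holds from the OS
    outsideCacheGiB: (allocated - cacheUsed) / GiB
  };
}

// Hypothetical sample values for illustration only.
var sample = {
  wiredTiger: { cache: {
    "maximum bytes configured": 13 * Math.pow(1024, 3),
    "bytes currently in the cache": 12 * Math.pow(1024, 3)
  } },
  tcmalloc: { generic: {
    current_allocated_bytes: 20 * Math.pow(1024, 3),
    heap_size: 22 * Math.pow(1024, 3)
  } }
};

var summary = summarizeMemory(sample);
console.log(summary);
```

A large and growing `outsideCacheGiB` across successive captures would suggest that the growth is happening outside the configured cache, which is consistent with the behaviour described in this report.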