[SERVER-27818] mongod 3.2.11 out of memory - killed by OOM killer Created: 26/Jan/17 Updated: 31/Jan/17 Resolved: 31/Jan/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 3.2.11 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Meni Livne | Assignee: | Kelsey Schubert |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
EC2 m3.xlarge server with 15GB of RAM and 15GB of swap space. |
||
| Attachments: |
|
| Operating System: | ALL |
| Participants: |
| Description |
|
Our deployment has 4 shards running version 3.2.11, each consisting of a primary, a secondary and an arbiter, all using the wiredTiger engine. Primaries and secondaries are on m3.xlarge servers with 15 GB of RAM and 15 GB of swap space. After running smoothly for several months, mongod has recently started being killed without warning by the kernel for running out of memory. This usually happens on secondaries, but can also happen on primaries. The servers do not seem to be under unusual query load when it happens. We do not have any text indexes, so it does not seem related to [...]. Attached is example dmesg output from one of the shards. We can also provide the contents of the diagnostic.data directory. |
| Comments |
| Comment by Kelsey Schubert [ 30/Jan/17 ] |
|
It looks like the issue you are experiencing is likely caused by the configuration of Xen. Memory ballooning and/or a full swap could explain this behavior. From the information you have provided, I do not see anything to indicate a bug in the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this involving more discussion would be best posted on the mongodb-users group. Kind regards, |
| Comment by Meni Livne [ 28/Jan/17 ] |
|
There are no containers. As for limits, max resident set is unlimited. These are the full limits for the mongod process. Limit Soft Limit Hard Limit Units (Kernel version: Linux 4.4.0-59-generic #80-Ubuntu SMP Fri Jan 6 17:47:47 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux) |
| Comment by Kelsey Schubert [ 27/Jan/17 ] |
|
Thank you for uploading the files. I've examined the system logs and see that total memory utilization when the system killed mongod was 8259476 kB, or about 8 GB. Are you aware of any containers or ulimits that are constraining the available memory? Kind regards, |
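The arithmetic behind the "about 8 GB" figure, and why it prompts the question about limits, can be sketched as follows. This is a minimal illustration rather than part of the original analysis: it assumes the kernel's "kB" means KiB (1024 bytes) and that the 15 GB of RAM in the Environment field means 15 GiB. The names `TOTAL_AT_KILL_KB`, `RAM_GIB` and `kb_to_gib` are hypothetical.

```python
# Figures come from the ticket; the unit interpretations are assumptions.
TOTAL_AT_KILL_KB = 8259476  # memory utilization reported when mongod was killed
RAM_GIB = 15                # RAM from the Environment field, assumed to be GiB


def kb_to_gib(kb: int) -> float:
    """Convert a kernel-reported kB value (assumed KiB) to GiB."""
    return kb / (1024 * 1024)


used_gib = kb_to_gib(TOTAL_AT_KILL_KB)
fraction_of_ram = used_gib / RAM_GIB

# Roughly half of physical RAM was in use at the time of the kill.
print(f"{used_gib:.2f} GiB in use, {fraction_of_ram:.0%} of {RAM_GIB} GiB RAM")
```

Under these assumptions, mongod was killed with only about half of the instance's RAM in use, which is consistent with asking whether something other than physical memory was limiting it.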
| Comment by Meni Livne [ 26/Jan/17 ] |
|
We've uploaded the diagnostic.data files for the 3 shards whose secondaries experienced the OOM. The files were copied when the mongod processes were down after being killed by the kernel. We've also uploaded the log files for those servers, from the time they were brought up until being killed and afterwards. |
| Comment by Kelsey Schubert [ 26/Jan/17 ] |
|
Would you please upload the diagnostic.data and complete logs for the affected node to this secure portal? Thank you, |