[SERVER-34382] Memory and swap consumption
Created: 09/Apr/18 | Updated: 27/Oct/23 | Resolved: 03/May/18
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance, WiredTiger |
| Affects Version/s: | 3.4.10, 3.4.14 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Benoit Bui | Assignee: | Bruce Lucas (Inactive) |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | SWNA |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
Description
Hi,

On a single server we're hosting 4 different mongod instances from two different sharded clusters:

The server has the following specs:

We have 20 servers with pretty much the same deployment. We are currently facing issues with memory and swap consumption. We recently restarted the secondary on a server, and it appears the primary uses almost 40 GB of RAM.

We tried limiting the WiredTiger cache (wiredTigerCacheSizeGB) to 12 GB per mongod a few weeks ago, but the issue still stands.

It can be seen that clust-users-1-shard8-1 (primary) uses 55.5% of the memory. We do not understand how and why a single instance can use more than half the total memory of the server despite wiredTigerCacheSizeGB being set. You'll find attached the configuration files for each instance, including the arbiters.

Thanks,
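For context, a quick way to compare the configured cache ceiling against what the OS actually attributes to each process is sketched below; the port is illustrative, not taken from this ticket.

```bash
# Configured WiredTiger cache ceiling for one instance (port is illustrative):
mongo --port 27018 --quiet --eval \
    'print(db.serverStatus().wiredTiger.cache["maximum bytes configured"])'

# Resident and virtual memory the OS attributes to each mongod on the box:
ps -C mongod -o pid,rss,vsz,pmem,args
```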
Comments
Comment by Bruce Lucas (Inactive) [03/May/18]

Hi Benoit,

As indicated above, it is expected that WiredTiger will use about 30 kB of memory, not accounted for as part of the cache, for each open table (index or collection), so if you have a large number of tables you will need to account for this memory requirement by either reducing the cache size correspondingly or increasing the amount of memory on the machine. I've opened a follow-up ticket for this.

Thanks for reporting this; I'll close this ticket now because this is expected behavior.

Bruce
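If it helps, on this version the cache ceiling can also be lowered without a restart via the wiredTigerEngineRuntimeConfig server parameter; a minimal sketch, with an illustrative port and target size:

```bash
# Shrink the WiredTiger cache at runtime to leave headroom for per-table
# overhead (port and size are illustrative; adjust per instance):
mongo --port 27018 --quiet --eval \
    'printjson(db.adminCommand({setParameter: 1,
                                wiredTigerEngineRuntimeConfig: "cache_size=8G"}))'
```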
Comment by Benoit Bui [03/May/18]

Hi Team,

The issue is occurring more and more often and has a real impact on production. Here's the swap consumption over the last 48 hours on our server with 4 GB of swap.

Memory consumption doesn't seem to have fluctuated much over this period; only swap moved that much. On the graph, the points where swap reaches 100% are where we restart the mongod instances.

Note that the gap in swap usage isn't due to a restart; it happens during normal run time. As usual, I've uploaded the `diagnostic.data` directory from the server onto the secure portal.
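As an aside for anyone tracking this kind of issue, per-process swap use on Linux can be read directly from /proc; a minimal sketch, assuming pgrep and a kernel that exposes VmSwap:

```bash
# Swap attributed to each running mongod (Linux; VmSwap needs kernel >= 2.6.34):
for pid in $(pgrep mongod); do
    printf '%s: ' "$pid"
    awk '/^VmSwap/ {print $2, $3}' "/proc/$pid/status"
done
```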
Thanks again.

Regards,
Benoit
Comment by Ian Whalen (Inactive) [30/Apr/18]

kelsey.schubert, can we assign this to you? It doesn't look like there is anything for the Storage team to do on this in its DWS state.
Comment by Bruce Lucas (Inactive) [25/Apr/18]

Hi Benoit,

We can confirm that a memory overhead of about 30 kB per open table is expected. This is not accounted for as part of the cache because it is not table data per se, but rather bookkeeping overhead. So for use cases such as yours, where you have a very large number of collections and indexes, you will need to take this into account when evaluating memory requirements.

Bruce
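To put a number on that, the overhead can be estimated by counting tables (collections plus indexes) and multiplying by the ~30 kB figure above; a sketch against a single instance, with an illustrative port:

```bash
# Estimate non-cache overhead: (collections + indexes) x ~30 kB per open table.
mongo --port 27018 --quiet --eval '
    var tables = 0;
    db.adminCommand({listDatabases: 1}).databases.forEach(function(d) {
        var dbh = db.getSiblingDB(d.name);
        dbh.getCollectionNames().forEach(function(c) {
            tables += 1 + dbh.getCollection(c).getIndexes().length;
        });
    });
    print(tables + " tables => ~" + (tables * 30 / 1024 / 1024).toFixed(1) + " GB overhead");
'
```

For example, 350,000 tables at 30 kB each comes to roughly 10 GB on top of the configured cache.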
Comment by Benoit Bui [25/Apr/18]

Hi,

I've added new diagnostic.data archives to the secure portal. We have a server (ss2-bl9) that consumed all of its swap this week, up to the point where a mongod was killed by the oom-killer. I have a hard time understanding why it is swapping, and why one mongod is using up to 60% of memory (a bit more than 40 GB in our case) despite the WiredTiger cache size being small.

Thanks
Comment by Bruce Lucas (Inactive) [19/Apr/18]

Hi Benoit,

Thanks for the clarification, and understood. During startup a portion of the files (specifically, the indexes) are opened to do some start-up checks. The amount of memory used can then vary later on as files are opened and closed, depending on application access patterns. We're working now to verify that the magnitude of memory used is as expected.

Thanks,
Comment by Benoit Bui [19/Apr/18]

Hi Bruce,

I'll just add some info here in case I didn't mention it before, or at least not clearly. I'll let you investigate the data you have.

Thanks,
Comment by Bruce Lucas (Inactive) [18/Apr/18]

Thanks for the additional information, Benoit.

From the ps and gdb data you uploaded previously we can see that, as you suspected, the increase in memory occurred at the point where WiredTiger opens a large number of files during startup. As you know, there is a separate WiredTiger file for each table, where a table is an index or a collection. There is some per-table overhead in WiredTiger, but we are still checking whether the amount of memory overhead per table that you are experiencing is expected.

Bruce
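Since each table corresponds to one .wt file under the data directory, the number of tables involved can be confirmed from the filesystem; a minimal sketch, with an illustrative dbPath:

```bash
# One .wt file per collection/index; the count sizes the per-table overhead
# (dbPath is illustrative; recurse in case directoryPerDB is enabled):
find /var/lib/mongodb -name '*.wt' | wc -l
```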
Comment by Benoit Bui [18/Apr/18]

Hi,

We experienced the same issue during the night. This shows the gap in swap usage. I've added the logs from the nodes on the server, and also the $dbpath/diagnostic.data directory, to the secure portal.

Thanks,
Comment by Benoit Bui [17/Apr/18]

Hi Bruce,

I've uploaded the files you requested. About the 10 minutes without any logs: we had always attributed this behaviour to the listing of the database files.

Thanks,
Comment by Bruce Lucas (Inactive) [16/Apr/18]

Hi Benoit,

Thanks for the log files. They show no unusual messages, but there is an unusual gap of about 10 minutes with no messages, and I suspect the size of the memory excess may be related to the length of this gap. During the gap a number of things may be happening that don't leave a log trace. Also, this occurs before our internal data collection has started, so that data does not provide useful information about this problem.

However, if my belief that the memory increase occurs during this long gap is correct, we may be able to collect some external data using ps and gdb to help understand what is happening. Would you be able to restart one of the affected mongod processes, and then immediately start some data collection as follows:
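(The original commands were not preserved here; a minimal sketch of this kind of collection, assuming pgrep and gdb are available and a single freshly started mongod, might look like the following.)

```bash
# Sketch only: sample memory usage once per second and stack traces every
# 5 seconds, in parallel, for the most recently started mongod.
pid=$(pgrep -n mongod)
while true; do ps -o pid,rss,vsz,pmem -p "$pid"; sleep 1; done > ps.log 2>&1 &
while true; do
    gdb -p "$pid" --batch -ex 'thread apply all bt' >> gdb.log 2>&1
    sleep 5
done &
```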
This will capture, in parallel, memory usage information once per second and stack traces every 5 seconds. Then, when mongod has reached the point where it is ready to accept connections, if it follows the same pattern as before it will be using excessive memory. At that point, please upload the collected output.

Thanks,
Comment by Benoit Bui [16/Apr/18]

Hi Bruce,

I've added the logs covering the span you asked for to the portal.

Thanks,
Comment by Bruce Lucas (Inactive) [16/Apr/18]

Hi Benoit,

The data from clust-users-1-shard5-1 and clust-users-2-shard3-1 both show a very large excess of 10-20 GB of memory usage. In both cases the increase happened very soon after startup on 04-09 (UTC). Can you please also upload the mongod log files for those nodes covering the time between that restart and now? This may provide information that we need to understand the reason for the large increase in memory use.

Thanks,
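One way to trim a large log file to just that window before uploading is sketched below; the path and timestamp are illustrative, not from this ticket.

```bash
# Keep everything from the 04-09 restart onward (path and date illustrative):
sed -n '/^2018-04-09T/,$p' /var/log/mongodb/mongod.log > mongod-since-restart.log
```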
Comment by Benoit Bui [16/Apr/18]

Hi Bruce,

Thanks for your answer and the secure upload portal. We had a server on which we saw the issue in the last 7 days, and which had a node restarted in the meantime.

Thanks,
Comment by Bruce Lucas (Inactive) [12/Apr/18]

Hi Benoit,

How long after restarting a node does it take to develop this problem? I'm interested in seeing the history over time of memory usage by the mongod processes; we keep about a week or so of data in diagnostic.data. So if the problem has developed on a node within that window, please upload the data from that node.

By "data from that node" I mean, for both mongod processes running on the node, the $dbpath/diagnostic.data directory.

I've generated a secure upload portal for this data.

Thanks,
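For completeness, a sketch of packaging those directories for upload; the dbPaths are illustrative, not from this ticket.

```bash
# Bundle diagnostic.data from each instance on the node (paths illustrative):
for p in /data/shard5 /data/shard3; do
    tar czf "$(basename "$p")-diagnostic.tar.gz" -C "$p" diagnostic.data
done
```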