[SERVER-69745] Memory not entirely used, Mongo uses swap Created: 13/Sep/22 Updated: 27/Sep/22 Resolved: 27/Sep/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Sorint Lab | Assignee: | Chris Kelly |
| Resolution: | Done | Votes: | 0 |
| Labels: | WiredTiger, cache, cachesize, memory-management |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | Red Hat Enterprise Linux Server release 7.6 (Maipo) |
| Operating System: | ALL |
| Steps To Reproduce: | Red Hat Enterprise Linux Server release 7.6 (Maipo) |
| Description |
|
On a production three-node MongoDB replica set, we observe that the database uses only about 60% of the RAM and then starts using swap. This causes severe slowness at the application level. The "cacheSize" parameter is not set; according to the docs, by default MongoDB uses 50% of (available RAM minus 1 GB) for the WiredTiger internal cache. The OS swappiness parameter is set to 1. We see that it is recommended not to increase the WiredTiger internal cache size above its default value. Would it be OK to increase it anyway in this case?
Thank you. |
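For reference, the default mentioned in the description can be worked through numerically. The MongoDB documentation gives the default WiredTiger internal cache size as the larger of 50% of (RAM minus 1 GB) and 256 MB. The following is a minimal sketch of that arithmetic only; the function name and the sample RAM sizes are illustrative assumptions and do not come from this ticket.

```javascript
// Documented default for the WiredTiger internal cache:
// the larger of 50% of (RAM - 1 GiB) and 256 MiB.
const GIB = 1024 ** 3;
const MIB = 1024 ** 2;

// defaultWiredTigerCacheBytes is a hypothetical helper name used for illustration.
function defaultWiredTigerCacheBytes(totalRamBytes) {
  const halfOfRamMinusOneGiB = 0.5 * (totalRamBytes - GIB);
  return Math.max(halfOfRamMinusOneGiB, 256 * MIB);
}

// Sample RAM sizes are illustrative only, not taken from the ticket.
for (const ramGiB of [1, 4, 16, 64]) {
  const cacheGiB = defaultWiredTigerCacheBytes(ramGiB * GIB) / GIB;
  console.log(`${ramGiB} GiB RAM -> ${cacheGiB.toFixed(2)} GiB default cache`);
}
```

With this default the cache is sized at a little under half of RAM. The rest is left for the filesystem cache and for mongod memory allocated outside the WiredTiger cache, so resident memory well below 100% of RAM is not unusual in itself.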
| Comments |
| Comment by Chris Kelly [ 27/Sep/22 ] |
|
Hello,
After looking at your diagnostic data, I see a pattern of queuing and high utilization on your storage disks, where time is spent in "ss wt cache application threads page read from disk to cache". Most of the time always seems to be spent going from disk to cache, and not the other way around. This correlates with increases in your system memory swap cached.
Your WT cache contains about 25GB, or 80%, at basically all times. However, your ss mem resident fluctuates between 35GB and 44GB, alongside instances of memory fragmentation.
You could consider experimenting with swappiness, or trying tcmallocAggressiveMemoryDecommit to aggressively return free pages to the OS, where they can be reused by tcmalloc to satisfy new memory requests: db.adminCommand( { setParameter: 1, tcmallocAggressiveMemoryDecommit: 1 } )
In general, there seems to be a cyclic pattern of memory fragmentation occurring every few hours, which could be contributing to this issue as pages are loaded from disk to cache. There could be other causes, however.
MongoDB 4.0 reached end of life in April 2022 and is no longer supported, so I'm going to close this ticket for now. If you suspect a bug in 4.2 or later, however, we would be interested in investigating it here in the SERVER project. I recommend consulting the MongoDB Developer Community Forums for additional information in the meantime.
Regards, Christopher
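The figures quoted in the comment above can be put side by side with some rough arithmetic. This is only a sketch based on the approximate numbers in the comment (25GB of cache content at 80% fill, resident memory between 35GB and 44GB); the variable names are illustrative, and the split between cache and non-cache memory is an estimate, not a measurement from the diagnostic data.

```javascript
// Approximate figures quoted in the comment above (GB).
const cacheUsedGB = 25;       // "WT cache contains about 25GB"
const cacheFillPercent = 80;  // "or 80% at basically all times"
const residentGB = [35, 44];  // ss mem resident range

// Implied configured cache size: used amount divided by fill ratio.
const impliedCacheMaxGB = (cacheUsedGB * 100) / cacheFillPercent;

// Resident memory not accounted for by cache contents (rough estimate).
const outsideCacheGB = residentGB.map((resident) => resident - cacheUsedGB);

console.log(`implied cache max: ${impliedCacheMaxGB} GB`);
console.log(`resident outside cache: ${outsideCacheGB[0]} to ${outsideCacheGB[1]} GB`);
```

On these numbers, roughly 10GB to 19GB of resident memory sits outside the cache contents. That is consistent with the comment's suggestion that allocator fragmentation is a contributor, though other allocations could account for part of it.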
|
| Comment by Chris Kelly [ 15/Sep/22 ] |
|
Hello, We have received your files and will update the ticket when we have further information. Christopher |
| Comment by Sorint Lab [ 15/Sep/22 ] |
|
Hello Chris, can you confirm you received the files? Is there any update?
Thank you. |
| Comment by Sorint Lab [ 14/Sep/22 ] |
|
Hello, we have attached the requested files. The issue has been occurring for several weeks, but it has intensified over the last few days, so we have provided logs from yesterday and today.
Thank you. |
| Comment by Chris Kelly [ 13/Sep/22 ] |
|
Hello, I've created a secure upload portal for you. Files uploaded to this portal are hosted on Box, are visible only to MongoDB employees, and are routinely deleted after some time. For each node in the replica set spanning a time period that includes the incident, would you please archive (tar or zip) and upload to that link:
Regards, |