[SERVER-69745] Memory not entirely used, Mongo uses swap Created: 13/Sep/22  Updated: 27/Sep/22  Resolved: 27/Sep/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sorint Lab Assignee: Chris Kelly
Resolution: Done Votes: 0
Labels: WiredTiger, cache, cachesize, memory-management
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Red Hat Enterprise Linux Server release 7.6 (Maipo)
MongoDB with a 3-member replica set


Attachments: PNG File image-2022-09-27-05-51-56-470.png     PNG File image-2022-09-27-05-58-57-580.png    
Issue Links:
Duplicate
is duplicated by SERVER-69744 Query optimization Closed
Related
related to WT-9855 Memory not entirely used, switch on swap Closed
Operating System: ALL
Steps To Reproduce:

Red Hat Enterprise Linux Server release 7.6 (Maipo)
MongoDB with a 3-member replica set

Participants:

 Description   

On a production 3-node MongoDB replica set we're observing that the database uses only about 60% of the RAM and then starts using swap. This causes severe slowness at the application level.

The "cacheSizeGB" parameter is not set: from the docs we read that by default WiredTiger will use 50% of (available RAM minus 1 GB).
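
As a point of reference, the effective cache limit and current usage can be read from serverStatus in the mongo shell; this is only a diagnostic sketch using the standard WiredTiger statistics names:

// Effective WiredTiger cache limit in GB (the default described above, unless cacheSizeGB overrides it)
db.serverStatus().wiredTiger.cache["maximum bytes configured"] / (1024 * 1024 * 1024)
// Data currently held in the WiredTiger cache, in GB
db.serverStatus().wiredTiger.cache["bytes currently in the cache"] / (1024 * 1024 * 1024)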

The OS swappiness parameter is set to 1.

We see that it is suggested to avoid increasing the WiredTiger internal cache size above its default value.

Would it be OK to increase it anyway in this case?
Otherwise, could increasing the server RAM be a solution?

 

Thank you.



 Comments   
Comment by Chris Kelly [ 27/Sep/22 ]

Hello,

After looking at your diagnostic data, I see a pattern of queuing/high utilization on your storage disks, where time is spent waiting on "ss wt cache application threads page read from disk to cache". Most of the time seems to be spent going from disk to cache, and not the other way around. This correlates with increases in your "system memory swap cached" metric.
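
(For reference, "ss" here is shorthand for serverStatus in the diagnostic data; the same counters can be sampled directly from the shell, assuming the standard serverStatus field names:)

// Cumulative time (in microseconds) application threads have spent reading pages from disk into the WT cache
db.serverStatus().wiredTiger.cache["application threads page read from disk to cache time (usecs)"]
// Resident memory of the mongod process, in megabytes
db.serverStatus().mem.resident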

Your WT cache contains about 25 GB (roughly 80% full) at basically all times. However, your "ss mem resident" is fluctuating between 35 GB and 44 GB alongside instances of memory fragmentation. You could consider experimenting with swappiness or trying tcmallocAggressiveMemoryDecommit to aggressively return free pages to the OS, where they can be reused to satisfy new memory requests:

db.adminCommand( { setParameter: 1, tcmallocAggressiveMemoryDecommit: 1 } )
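
If you do try it, you can confirm the parameter took effect on a node with getParameter (a minimal check, assuming a server version that exposes this parameter):

db.adminCommand( { getParameter: 1, tcmallocAggressiveMemoryDecommit: 1 } )

It can also be set at startup through the setParameter section of the mongod configuration file if you want it to persist across restarts.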

In general, it seems there is a cyclic pattern of memory fragmentation occurring every few hours, which could be contributing to this issue as pages are loaded from disk into the cache. There could be other possibilities, however.

MongoDB 4.0 reached end of life in April 2022 and is no longer supported, so I'm going to close this ticket for now. If you suspect a bug in 4.2+, however, we would be interested in investigating it here in the SERVER project. In the meantime, I recommend consulting the MongoDB Developer Community Forums for additional information.

Regards,

Christopher

 

 

Comment by Chris Kelly [ 15/Sep/22 ]

Hello,

We have received your files and will update the ticket when we have further information.

Christopher

Comment by Sorint Lab [ 15/Sep/22 ]

Hello Chris,

Can you confirm that you received the files?

Is there any update?

 

Thank you.

Comment by Sorint Lab [ 14/Sep/22 ]

Hello,

We have attached the requested files.

The issue has been occurring for several weeks, but it intensified over the last few days, so we have provided logs from yesterday and today.

 

Thank you.

Comment by Chris Kelly [ 13/Sep/22 ]

Hello,

I've created a secure upload portal for you. Files uploaded to this portal are hosted on Box, are visible only to MongoDB employees, and are routinely deleted after some time.

For each node in the replica set, covering a time period that includes the incident, would you please archive (tar or zip) and upload the following to that link:

  • the mongod logs
  • the $dbpath/diagnostic.data directory (the contents are described here)

Regards,
Christopher
