[SERVER-37541] MongoDB Not Returning Free Space to OS Created: 10/Oct/18  Updated: 07/Jun/23  Resolved: 12/Oct/18

Status: Closed
Project: Core Server
Component/s: Performance, WiredTiger
Affects Version/s: 3.2.11
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Rakhi Maheshwari Assignee: Bruce Lucas (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File buildInfo.txt     Text File collStatsLocalOplog.txt     Text File getCmdLineOpts.txt     Text File hostInfo.txt     File mongoDebug_07T081018.log     File mongoDebug_08T08102018.log     File mongoDebug_09T08102018.log     File mongoDebug_10T08102018.log     Text File rsStatus.txt     Text File serverStatus.txt    
Issue Links:
Duplicate
duplicates SERVER-33296 Excessive memory usage due to heap fr... Backlog
duplicates SERVER-31417 Improve tcmalloc when decommitting la... Backlog
Operating System: ALL
Participants:

 Description   

MongoDB memory usage is very high (75%). For Mongodb01, Mongodb02, and Mongodb03, available memory went down to 48%, 29%, and 31% (from 67%, 46%, and 50% respectively) over the 15 days of load testing. Is there a chance that memory usage reaches 90% and the process gets killed? Why is it not returning free space to the OS?
Attached: buildInfo.txt, collStatsLocalOplog.txt, getCmdLineOpts.txt, hostInfo.txt, rsStatus.txt, serverStatus.txt



 Comments   
Comment by Bruce Lucas (Inactive) [ 12/Oct/18 ]

Hi Rakhi,

Thanks for your report and for the serverStatus data. We had asked for the diagnostic.data directory because it contains a time series of serverStatus over an extended period and would give us a clearer picture of the extent and history of the issue than a single point-in-time snapshot of serverStatus.

From that snapshot we have the following information:

current_allocated_bytes        25,264,303,768
heap_size                      60,006,965,248
pageheap_free_bytes            16,066,838,528
pageheap_unmapped_bytes        15,583,764,480
resident                       42,309
virtual                        58,300

  • The mongod process has allocated about 25 GB (current_allocated_bytes), but tcmalloc is using about 60 GB virtual memory (heap_size).
  • Of that only 42 GB is resident, as it has returned about 16 GB (pageheap_unmapped_bytes) of free memory to the o/s. (Note that if the numbers you quoted are referring to virtual memory and not resident memory, then you may be getting an inflated view of the memory usage.)
  • Still it is retaining about 16 GB of free memory (pageheap_free_bytes) that could in theory be returned to the o/s but has not been, so there is a real issue.
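For reference, the same counters can be read programmatically. Below is a minimal sketch, assuming pymongo and a mongod built with tcmalloc reachable at localhost:27017 (adjust the URI for your deployment); the field paths match the serverStatus sections quoted above, and mem.resident / mem.virtual are reported in MB.

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
status = client.admin.command("serverStatus")

generic = status["tcmalloc"]["generic"]    # allocator-wide totals
detail = status["tcmalloc"]["tcmalloc"]    # page-heap level counters
mem = status["mem"]                        # resident/virtual, in MB

print("current_allocated_bytes:", generic["current_allocated_bytes"])
print("heap_size:              ", generic["heap_size"])
print("pageheap_free_bytes:    ", detail["pageheap_free_bytes"])
print("pageheap_unmapped_bytes:", detail["pageheap_unmapped_bytes"])
print("resident (MB):          ", mem["resident"])
print("virtual (MB):           ", mem["virtual"])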

There are two issues here:

  • Why is there so much free memory in the tcmalloc page heap? This could be because at some point mongod allocated and then freed a large amount of memory, or it could be due to fragmentation; we can't distinguish using just a single point-in-time snapshot. If it is fragmentation, that's an issue we are aware of, tracked by SERVER-33296.
  • Why is the free memory not returned to the o/s? We have observed that tcmalloc seems to be reluctant to return memory to the o/s, then sometimes suddenly decides to unmap a large amount all at once. In addition to the memory utilization issue you note, this can cause performance problems when tcmalloc returns the memory to the o/s. This issue is tracked by SERVER-31417.

Finally, you may be able to get some relief by setting the environment variable TCMALLOC_AGGRESSIVE_DECOMMIT or equivalently the server parameter tcmallocAggressiveMemoryDecommit, which will cause memory to be returned to the o/s more aggressively. We have not enabled this by default as we have observed it to impact performance in some workloads.
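If you want to try that, here is a minimal sketch of flipping the server parameter at runtime, assuming pymongo and a mongod at localhost:27017; a runtime change does not persist across restarts, so a permanent change would go in the config file's setParameter section or via the TCMALLOC_AGGRESSIVE_DECOMMIT environment variable at startup.

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # adjust for your deployment

# Enable aggressive decommit at runtime; watch throughput afterwards, since
# this can cost performance on some workloads.
client.admin.command("setParameter", 1, tcmallocAggressiveMemoryDecommit=1)

# Read the current value back to confirm.
print(client.admin.command("getParameter", 1, tcmallocAggressiveMemoryDecommit=1))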

Since I believe that the issues you are seeing are already covered by other SERVER tickets I'll close this ticket as a duplicate; please watch those tickets for updates, and thanks again for reporting this.

Comment by Rakhi Maheshwari [ 11/Oct/18 ]

Nick Brewer I have already attached the files showing the diagnostic data, with the output of serverStatus, hostInfo, buildInfo, etc. This scenario of memory utilization is seen repeatedly, as available memory goes down from 40% to 29% and even lower. Please find the attached logs for that scenario: mongoDebug_10T08102018.log, mongoDebug_09T08102018.log, mongoDebug_08T08102018.log, mongoDebug_07T081018.log

Comment by Nick Brewer [ 10/Oct/18 ]

rmaheshwari MongoDB uses the operating system's memory management. If you can upload a mongod log and an archive of the dbpath/diagnostic.data directory from an affected node, we can take a closer look at the resource utilization you're seeing.
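For what it's worth, a minimal sketch of packaging that directory with Python's tarfile module, assuming a hypothetical dbpath of /var/lib/mongodb (substitute the affected node's actual --dbpath):

import tarfile

DBPATH = "/var/lib/mongodb"  # hypothetical; use the affected node's dbpath

# Bundle dbpath/diagnostic.data into a single gzipped archive for upload.
with tarfile.open("diagnostic.data.tar.gz", "w:gz") as tar:
    tar.add(DBPATH + "/diagnostic.data", arcname="diagnostic.data")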

-Nick
