[SERVER-50014] Mongos can accumulate ChunkManagers for dropped collections that are no longer being used Created: 30/Jul/20  Updated: 27/Oct/23  Resolved: 08/Jan/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.2.6
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Stephen Paul Adithela Assignee: Kaloian Manassiev
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File RoutingTableHistory.png     Text File chunks-col-stats.txt     PNG File image-2020-09-13-23-41-29-108.png     PNG File image-2020-09-13-23-48-11-445.png     PNG File normalmongos.png     PNG File problematicmongos.png     Text File serverstatus-ttp01.txt     Zip Archive stacks.html.zip    
Issue Links:
Related
related to SERVER-50944 Add number of chunks and estimated ro... Closed
related to SERVER-49697 Callers should clear their shared_ptr... Blocked
Operating System: ALL
Participants:

 Description   

A mongos does not proactively clear cached routing metadata for a collection unless it was the mongos that dropped it (or unless it tries to perform an operation on the already-dropped namespace), so cached metadata for dropped collections can stay in the cache indefinitely. This is problematic for use cases where sharded collections are constantly being created and dropped: in theory such a pattern should require no additional memory, but in practice each mongos slowly accumulates chunk metadata for old, dropped collections.

Original title: mongos memory footprint keeps increasing

Original description

Machine type: Amazon Linux 2

Shard count: 5

Cluster Architecture: PSA

Config nodes: 3

Mongo Version: 4.2.6

 

We have a mongos running on each application server instance. The mongos seems to keep accumulating memory over time until the mongos service is restarted.

I don't see a mongos configuration option (https://docs.mongodb.com/manual/reference/program/mongos/) to control the memory growth beyond a certain threshold. Can it be made configurable, like wiredTigerCacheSize?

 

This seems like a memory leak issue! Is there a way we can inspect what kind of mongos-related data is stored in memory?

 

Attaching the serverStatus output.

 

Thanks,

Stephen



 Comments   
Comment by Kaloian Manassiev [ 08/Jan/21 ]

Hi stephenpaul2727@gmail.com,

Thank you for the detailed report and for suggesting the workaround, and apologies for the delayed response!

In the latest (not yet released) version of MongoDB we did some work to address this problem by capping the cache of routing information that the routers keep and making that cache an LRU (SERVER-46199). For collections with a large number of chunks this probably still doesn't help much, because 10,000 is a large number, but at least there is now a cap on the number of collections we will cache.

We have some plans underway to make the memory accounting of these caches better, but it might take some time to implement. In the meantime, I am closing this ticket with the understanding that the workaround is to run flushRouterConfig.
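For reference, the workaround can be run from the mongo shell on each mongos; a sketch (the namespace in the second form is just a placeholder):

// Reset the entire routing table cache on this mongos.
db.adminCommand({ flushRouterConfig: 1 })
// Or, on 4.2, flush only the cached routing table for a specific namespace.
db.adminCommand({ flushRouterConfig: "mydb.mycoll" })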

Best regards,
-Kal.

Comment by Randolph Tan [ 21/Sep/20 ]

I think the workaround should work.

Comment by Stephen Paul Adithela [ 18/Sep/20 ]

Thanks! If I were to create a workaround script that runs a count query on all dropped: true collections in the config DB (from every mongos), that should force each mongos to access the namespace and hence clear the metadata for the dropped collection, right? I am trying to avoid restarting the mongos and am wondering whether you think this could work.
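Something along these lines is what I have in mind, a rough sketch to be run via the mongo shell against each mongos (the filter and the namespace handling are illustrative):

// On 4.2, dropped sharded collections keep an entry in config.collections
// with dropped: true.
db.getSiblingDB("config").collections.find({ dropped: true }).forEach(function (coll) {
    var firstDot = coll._id.indexOf(".");
    var dbName = coll._id.substring(0, firstDot);
    var collName = coll._id.substring(firstDot + 1);
    // Touching the namespace should make this mongos discover the drop and
    // clear/refresh its cached routing metadata for it.
    db.getSiblingDB(dbName).getCollection(collName).count({});
});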

 

Comment by Randolph Tan [ 18/Sep/20 ]

They currently do. Dropped collections stay in config.collections with dropped: true. However, a mongos doesn't actively check whether a collection has already been dropped; it only learns about the drop when it attempts to access the namespace and the shard responds that the collection is dropped.
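A quick way to see how many of those entries are lingering, as a sketch run from any mongos:

// Count dropped sharded collections still recorded in the sharding metadata.
db.getSiblingDB("config").collections.count({ dropped: true })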

Comment by Stephen Paul Adithela [ 17/Sep/20 ]

Thanks Randolph. Yes, restarting the mongos will definitely release all that cached memory. Could the config servers store this dropped-sharded-collection info and then propagate it to the rest of the mongos nodes?

Comment by Randolph Tan [ 17/Sep/20 ]

If all the mongos accessed the sharded collection in the past, they will all have its metadata cached. When the drop is called, only the specific mongos that ran the drop will clear its cache for that collection; the rest of the mongos will be oblivious to the fact that the collection has been dropped. Unfortunately, we currently don't have a convenient way to tell these mongos to purge cached metadata for collections that were already dropped, so I'm hesitant to provide a complicated procedure. The only sure-fire way to free the memory is to restart the mongos.

Comment by Stephen Paul Adithela [ 17/Sep/20 ]

Thanks Randolph for the info, it makes sense. Our Java app (using the MongoDB Java driver) is currently dropping and recreating these sharded collections every day. If I understand correctly, you are saying that if the drop and recreate is done directly from the mongo shell, then the cached metadata should get cleaned up, right?

Would this cached ChunkManager metadata be present in every mongos VM's memory? For example, if we have 10 mongos instances running on separate VMs and the drop of the sharded collection is issued on all of them (only one mongos succeeds and the rest report it is already dropped), is the cached metadata for the dropped collection kept only in the memory of the mongos that succeeded, or in the memory of all the mongos instances?

Thank you!

Comment by Randolph Tan [ 17/Sep/20 ]

I see. Currently, a mongos doesn't proactively clean up cached metadata for dropped collections unless it was the mongos where the drop was called, or unless it tries to perform an operation on that namespace and discovers that it has already been dropped. This would explain the slow accumulation of RoutingTable::makeNew objects not being freed.

Comment by Stephen Paul Adithela [ 17/Sep/20 ]

Hi Randolph,
Yes, every day we have 3-4 sharded collections being created and 3-4 sharded collections being dropped. Currently, all of the mongos instances are experiencing this constant memory growth pattern.

 

Comment by Randolph Tan [ 17/Sep/20 ]

Are sharded collections regularly being created and dropped in this cluster? How many mongos are experiencing this issue vs. not experiencing it?

Thanks!

Comment by Stephen Paul Adithela [ 15/Sep/20 ]

Hi Dima, attaching the config.chunks stats output:
chunks-col-stats.txt

Comment by Dmitry Agranat [ 15/Sep/20 ]

Hi stephenpaul2727@gmail.com, while we are still investigating, could you provide the size of your config.chunks collection?
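For reference, one way to gather this from a mongos (a sketch):

// Number of chunk documents and the stats (including size) of config.chunks.
db.getSiblingDB("config").chunks.count()
db.getSiblingDB("config").chunks.stats()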

Comment by Stephen Paul Adithela [ 14/Sep/20 ]

Hi Dima,

We have 1054829 chunks in our cluster at the moment. 

Distribution per shard:

Comment by Dmitry Agranat [ 13/Sep/20 ]

Hi stephenpaul2727@gmail.com, please hold off on setting tcmallocAggressiveMemoryDecommit as proposed in my last comment; we are still investigating. Could you also clarify how many chunks this cluster has?

Comment by Dmitry Agranat [ 13/Sep/20 ]

Hi stephenpaul2727@gmail.com, thanks for uploading the requested information.

Unfortunately, some of the stacks that are present in the diagnostic.data are not present in the logs you've attached (the grep output).

I did manage to catch one stack that is related to getMore commands, which might indicate SERVER-36346:

mongos.log.2020-08-27T00-00-01:2020-08-26T03:00:43.021+0000 I  -        [ftdc] heapProfile stack881: { 0: "realloc", 1: "mongo::mongoRealloc", 2: "mongo::_BufBuilder<mongo::SharedBufferAllocator>::grow_reallocate", 3: "mongo::replyToQuery", 4: "mongo::Strategy::getMore", 5: "mongo::ServiceEntryPointMongos::handleRequest", 6: "mongo::ServiceStateMachine::_processMessage", 7: "mongo::ServiceStateMachine::_runNextInGuard", 8: "0x56209755032c", 9: "mongo::transport::ServiceExecutorSynchronous::schedule", 10: "mongo::ServiceStateMachine::_scheduleNextWithGuard", 11: "mongo::ServiceStateMachine::_sourceCallback", 12: "mongo::ServiceStateMachine::_sourceMessage", 13: "mongo::ServiceStateMachine::_runNextInGuard", 14: "0x56209755032c", 15: "0x5620978d7a1b", 16: "0x562097fea785", 17: "0x562097fea7e4", 18: "0x7ff704a6f40b", 19: "clone" }

However, given the incomplete logs, I was not able to collect enough stacks to account for the amount of memory growth we are looking for. Another issue I saw is slight memory fragmentation, but given the very small memory fluctuations for this workload, it will be rather difficult to determine the cause here.

Apart from the possible cause mentioned above (SERVER-36346), I'd like to see whether decommitting memory more frequently would help in this situation. You can enable it by setting:

db.adminCommand({setParameter: 1, tcmallocAggressiveMemoryDecommit: 1})

Thanks,
Dima

Comment by Stephen Paul Adithela [ 07/Sep/20 ]

Hi Dima,

Yes, the logs, grepped for heapProfile, have also been uploaded to the same secure portal.

Thanks,

Stephen

Comment by Dmitry Agranat [ 06/Sep/20 ]

Thanks stephenpaul2727@gmail.com, could you point me to the data requested in step 6:

From the mongoS logs, extract everything with heapProfile stack and upload it to the same secure portal

If those logs are still available, you can just grep for heapProfile.

Thanks,
Dima

Comment by Stephen Paul Adithela [ 03/Sep/20 ]

Hi Dima,

I have uploaded all the files to the secure portal according to the instructions in the previous comment.
Suffix tt --> problematic ones, where the mongos memory issues are seen
Suffix lla --> non-problematic

Comment by Dmitry Agranat [ 03/Sep/20 ]

Hi stephenpaul2727@gmail.com, did you have a chance to review my last comment?

Comment by Dmitry Agranat [ 23/Aug/20 ]

Hi stephenpaul2727@gmail.com,

Thank you for the clarification.

I suggest we re-collect the requested data, but with some modifications:

  1. Increase the diagnostic.data collection size from the default 200 MB to 1000 MB by modifying this parameter: diagnosticDataCollectionDirectorySizeMB. This gives us a longer time window to look back at the memory utilization, which is required because the memory growth is relatively small and we may need to collect data for more than a week (see the sketch after this list)
  2. Make sure log retention is more than a week, say 10 days
  3. Restart mongoS with heapProfilingEnabled=true
  4. Let mongoS run for a week
  5. Save the diagnostic.data and mongoS server logs
  6. Upload the diagnostic.data to the provided secure portal. From the mongoS logs, extract everything with heapProfile stack and upload it to the same secure portal
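A sketch of step 1, if you prefer to change the parameter at runtime from the mongo shell rather than in the config file (assuming your version allows this parameter to be set at runtime):

// Step 1: grow the FTDC directory cap so diagnostic.data covers a longer window.
db.adminCommand({ setParameter: 1, diagnosticDataCollectionDirectorySizeMB: 1000 })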

Thanks,
Dima

Comment by Stephen Paul Adithela [ 21/Aug/20 ]

Hi Dima,

Thanks for the reply! Our current VM (4 core/8 GB) has two services running. One is a Java application, which takes about 60% of total memory at any time; with GC, its memory upper limit more or less stays the same. The mongos starts with a smaller amount of memory but keeps growing continuously, as you have noticed, at about 50 MB per day. With 60% already occupied, we can only afford another 20-30% for the mongos, but it grows past that point, which invokes the OOM killer.

We have the same mongos version (4.2.6) running in another environment (with the same Java process, same VM specs, and same I/O rate). In that environment, the mongos memory growth is much more stable.

On the problematic VM, there is growth of 5% over the past 7 days, so ~410 MB. In the other, similar environment, the mongos growth is less than 1-2% over the past 30 days. Attaching screenshots of the memory growth on the problematic VM and the normal VM.

 

I understand delving deep into this issue might be hard to do online. I am okay with enabling heap profiling again on both of these VMs, collecting logs, and observing for a few days. Please let me know your thoughts; I am also open to any other suggestions.

 

Many Thanks,

Stephen

Comment by Dmitry Agranat [ 20/Aug/20 ]

Hi stephenpaul2727@gmail.com,

Since the requested logs are no longer available for the time window covered by the diagnostic.data we've collected, we won't be able to grep for the specific information you proposed in your last comment. However, I'd like to step back to better understand the issue you are reporting.

The mongos seems to keep accumulating memory over time until the mongos service is restarted.

Given what we saw via the diagnostic.data, the resident memory grows by ~50 MB per day. With 8 GB RAM, it will take 163 days to use all memory.

  • Could you clarify why the mongoS service is being restarted given the trend of 163 days to get to 8 GB?
  • Where do you see that mongoS continuously keeps accumulating memory over time?
  • What is the memory utilization today?

Just a reminder that memory is managed by the OS, and the OS does not "kick out" those memory pages unless new data comes in and memory pressure is applied to the older pages. Since I do not see any memory pressure (we hardly use 20% of the total RAM), I do not expect the OS to release memory back to the system.

Thanks,
Dima

Comment by Stephen Paul Adithela [ 19/Aug/20 ]

Hi Dmitry,

Unfortunately, we only have 7 days' worth of log retention available, so we can't provide mongos logs covering that time. But the issue also seems to be happening now. The logs also contain sensitive data because of the queries. Is there a specific thread/task you are interested in? I can exclude COMMAND lines if that is okay.

 

Yes, the resident memory of the mongos process shouldn't be growing. We wonder why the mongos keeps reserving that resident memory and not releasing it.

 

Thanks,

Stephen

Comment by Dmitry Agranat [ 19/Aug/20 ]

Hi stephenpaul2727@gmail.com, we would also need the requested mongoS server logs covering the time when heapProfiling was enabled. Please upload them to the same directory.

Also, from a quick look at the diagnostic.data, I see that the resident memory of the mongos process grows by about 50 MB per day. For the couple of days we have data for, this translates into 1450 MB at the start and 1550 MB after ~2 days. Do these numbers make sense to you, and is this what the issue is?

Comment by Stephen Paul Adithela [ 15/Aug/20 ]

Hi Jon, 

I have uploaded the mongos diag tgz file. 

Thanks,

Stephen

Comment by Jonathan Streets (Inactive) [ 13/Aug/20 ]

hi stephenpaul2727@gmail.com,

I've created a secure upload portal for you. Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Jon

Comment by Stephen Paul Adithela [ 11/Aug/20 ]

Hi Jon, is there a link to MongoDB's private upload location (for the FTDC data)? Could you let me know?

Thanks,

Stephen

Comment by Stephen Paul Adithela [ 03/Aug/20 ]

Thanks Jon for the info. I will attach the diagnostic data once we have captured the memory increase on the mongos.

Comment by Jonathan Streets (Inactive) [ 31/Jul/20 ]

hi stephenpaul2727@gmail.com,
thank you for the information so far. In order to investigate this issue further, we would need to obtain information from the mongos running with the --setParameter heapProfilingEnabled=true option. Importantly, this can impact performance by up to 30% for CPU-bound workloads and is not recommended for production use, so please consider carefully whether you'd prefer to go this route or not.

After the problem reoccurs with heap profiling enabled, you can disable it and upload the $dbpath/diagnostic.data directory and the mongos server logs to this ticket. The data in that directory is known as the FTDC data and is described here.

Regards,
Jon
