[SERVER-50014] Mongos can accumulate ChunkManager of dropped collections no longer being used Created: 30/Jul/20 Updated: 27/Oct/23 Resolved: 08/Jan/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.2.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Stephen Paul Adithela | Assignee: | Kaloian Manassiev |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Issue Links: |
|
| Operating System: | ALL |
| Participants: | |
| Description |
|
A mongos does not proactively clear cached routing metadata for a dropped collection unless it is the mongos that performed the drop (or unless it later attempts an operation on the dropped namespace), so cached metadata for dropped collections can stay in the cache indefinitely. This is problematic for use cases where sharded collections are constantly being created and dropped: such workloads should in theory require no additional memory over time, yet they cause mongos to slowly accumulate chunk metadata for old, dropped collections. Original title: mongos memory footprint keeps increasing
|
| Comments |
| Comment by Kaloian Manassiev [ 08/Jan/21 ] | ||
|
Thank you for the detailed report and for suggesting the workaround, and apologies for the delayed response! In the latest (not yet released) version of MongoDB we did some work to address this problem by capping the cache of routing information that the routers keep and making that cache an LRU. We have plans underway to improve the memory accounting of these caches, but it might take some time to implement. In the meantime, I am closing this ticket with the understanding that the workaround is to run flushRouterConfig. Best regards, | ||
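A minimal sketch of the flushRouterConfig workaround referred to above. It assumes shell access to every mongos (each router keeps its own routing cache), and the namespace names are placeholders:

```javascript
// Run against EACH mongos; every router keeps its own routing cache.

// Flush the cached routing table for one dropped namespace (placeholder name):
db.adminCommand({ flushRouterConfig: "mydb.droppedColl" });

// Flush the cached routing tables for a whole database:
db.adminCommand({ flushRouterConfig: "mydb" });

// Or flush all cached routing metadata on this mongos:
db.adminCommand({ flushRouterConfig: 1 });
```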
| Comment by Randolph Tan [ 21/Sep/20 ] | ||
|
I think the workaround should work. | ||
| Comment by Stephen Paul Adithela [ 18/Sep/20 ] | ||
|
Thanks! If I were to create a workaround script that runs a count query against every dropped: true collection listed in the config DB (from all mongos's), that should force each mongos to access the namespace and hence clear the cached metadata for the dropped collection, right? I am trying to avoid restarting mongos and am wondering whether you think this could work.
| ||
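A rough sketch of the count-based workaround described in the comment above. It assumes the 4.2-era catalog format, in which dropped collections remain in config.collections with dropped: true, and it has to be run through each mongos because every router keeps its own cache:

```javascript
// Run via the mongo shell connected to EACH mongos.
// Touching a dropped namespace makes the router refresh its routing info,
// discover the drop, and purge its cached metadata for that collection.
db.getSiblingDB("config").collections.find({ dropped: true }).forEach(function (coll) {
    var parts = coll._id.split(".");   // _id is the full namespace, e.g. "mydb.mycoll"
    var dbName = parts.shift();
    var collName = parts.join(".");
    print("touching dropped namespace: " + coll._id);
    db.getSiblingDB(dbName).getCollection(collName).count();
});
```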
| Comment by Randolph Tan [ 18/Sep/20 ] | ||
|
They currently do. Dropped collections stay in config.collections with dropped: true. However, a mongos doesn't actively check whether a collection has already been dropped; it will only learn about the drop when it attempts to access the namespace and the shard responds that the collection is dropped. | ||
| Comment by Stephen Paul Adithela [ 17/Sep/20 ] | ||
|
Thanks Randolph. Yes, restarting mongos would definitely release all that cached memory. Could the config servers store this dropped-sharded-collection info and propagate it to the rest of the mongos nodes? | ||
| Comment by Randolph Tan [ 17/Sep/20 ] | ||
|
If all the mongos accessed the sharded collection in the past, they will all have the metadata cached. When the drop is called, only the specific mongos that ran the drop will clear its cache for that collection; the rest of the mongos will be oblivious to the fact that the collection has been dropped. Unfortunately, we currently don't have a convenient way to tell these mongos to purge cached metadata for collections that were already dropped, so I'm hesitant to provide a complicated procedure. The only sure-fire way to free the memory is to restart the mongos. | ||
| Comment by Stephen Paul Adithela [ 17/Sep/20 ] | ||
|
Thanks Randolph for the info, it makes sense. Our Java app (using the MongoDB Java driver) currently drops and recreates these sharded collections every day. If I understand correctly, you are saying that if the drop and recreate were done directly on the mongo CLI, the cached metadata would get cleaned up, right? Would this cached ChunkManager metadata be distributed across the memory of all mongos VMs? For example, if we have 10 mongos's running on separate VMs and the drop of a sharded collection is issued through all of them (only one mongos succeeds and the rest report the collection as already dropped), does the cached metadata for the dropped collection remain only in the memory of the mongos that succeeded, or in the memory of all the mongos's? Thank you! | ||
| Comment by Randolph Tan [ 17/Sep/20 ] | ||
|
I see. Currently, a mongos doesn't proactively clean up cached metadata for dropped collections unless it was the mongos where the drop was called, or until it tries to perform an operation on that namespace and discovers that it is already dropped. This would explain the slow accumulation of RoutingTable::makeNew objects that are never freed. | ||
| Comment by Stephen Paul Adithela [ 17/Sep/20 ] | ||
|
Hi Randolph,
| ||
| Comment by Randolph Tan [ 17/Sep/20 ] | ||
|
Are sharded collections regularly being created and dropped in this cluster? How many mongos are experiencing this issue vs mongos not experiencing this issue? Thanks! | ||
| Comment by Stephen Paul Adithela [ 15/Sep/20 ] | ||
|
Hi Dima, Attaching config.chunks stats output | ||
| Comment by Dmitry Agranat [ 15/Sep/20 ] | ||
|
Hi stephenpaul2727@gmail.com, while we are still investigating, could you provide the size of your config.chunks collection? | ||
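A small sketch of one way to gather the requested config.chunks information from a mongos; the per-shard aggregation is an assumption about the breakdown wanted, not something specified in the ticket:

```javascript
// Run from any mongos.
var chunks = db.getSiblingDB("config").chunks;
print("total chunks: " + chunks.count());
printjson(chunks.stats());   // size, storageSize, count, etc.
// Chunk distribution per shard:
chunks.aggregate([{ $group: { _id: "$shard", chunks: { $sum: 1 } } }]).forEach(printjson);
```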
| Comment by Stephen Paul Adithela [ 14/Sep/20 ] | ||
|
Hi Dima, We have 1054829 chunks in our cluster at the moment. Distribution per shard: | ||
| Comment by Dmitry Agranat [ 13/Sep/20 ] | ||
|
Hi stephenpaul2727@gmail.com, please hold off on setting tcmallocAggressiveMemoryDecommit as proposed in my last comment; we are still investigating. Could you also clarify how many chunks this cluster has? | ||
| Comment by Dmitry Agranat [ 13/Sep/20 ] | ||
|
Hi stephenpaul2727@gmail.com, thanks for uploading the requested information. Unfortunately, some of the stacks that are present in the diagnostic.data are not present in the logs you've attached (grep output). I did manage to catch one stack related to getMore commands, and this might indicate
However, given the incomplete logs, I was not able to collect enough stacks to account for the amount of memory growth we are looking for. Another issue that I saw is slight memory fragmentation, but given the very small memory fluctuations for this workload, it will be rather difficult to determine the cause here. Apart from the mentioned possible cause as
Thanks, | ||
| Comment by Stephen Paul Adithela [ 07/Sep/20 ] | ||
|
Hi Dima, yes, the logs are also uploaded to the same secure portal, with heapProfile grepped. Thanks, Stephen | ||
| Comment by Dmitry Agranat [ 06/Sep/20 ] | ||
|
Thanks stephenpaul2727@gmail.com, could you point me to the data requested in step 6:
If those logs are still available, you can just grep for heapProfile. Thanks, | ||
| Comment by Stephen Paul Adithela [ 03/Sep/20 ] | ||
|
Hi Dima, I have uploaded all the files to the secure portal according to the instructions in the previous comment. Suffix lla --> non-problematic | ||
| Comment by Dmitry Agranat [ 03/Sep/20 ] | ||
|
Hi stephenpaul2727@gmail.com, did you have a chance to review my last comment? | ||
| Comment by Dmitry Agranat [ 23/Aug/20 ] | ||
|
Thank you for the clarification. I suggest we re-collect the requested data, but with some modifications.
Thanks, | ||
| Comment by Stephen Paul Adithela [ 21/Aug/20 ] | ||
|
Hi Dima, thanks for the reply! Our current VM (4 cores / 8 GB) has two services running. One is a Java application, which takes about 60% of total memory at any time; with GC, its memory upper limit more or less stays the same. mongos starts with a smaller amount of memory but keeps growing continuously, as you have noticed, by about 50 MB per day. With 60% already occupied, we can only afford another 20-30% for mongos, but it grows past that point, which invokes the OOM killer. We have the same mongos version (4.2.6) running in another environment (with the same Java process, same VM specs, and same IO rate), and there the mongos memory growth is much more stable. On the problematic VM, there is a growth of 5% over the past 7 days, so ~410 MB; on the other, similar environment, the mongos growth is less than 1-2% over the past 30 days. Attaching screenshots of memory growth on the problematic VM and the normal VM.
I understand delving deep into this issue might be hard online. I am okay with enabling heap profiling again on both of these VMs, collecting logs, and observing for a few days. Please let me know your thoughts; I am also open to any other suggestions.
Many Thanks, Stephen | ||
| Comment by Dmitry Agranat [ 20/Aug/20 ] | ||
|
Since the requested logs are no longer available for the time covered by the collected diagnostic.data, we won't be able to grep for the specific information you proposed in your last comment. However, I'd like to step back to better understand the issue you are reporting.
Given what we saw via the diagnostic.data, the resident memory grows by ~50 MB per day. With 8 GB of RAM, it will take 163 days to use all memory.
Just a reminder that memory is managed by the OS, and the OS does not "kick out" those memory pages unless new data comes in and memory pressure is applied to the older pages. Since I do not see any memory pressure (we hardly use 20% of the total RAM), I do not expect the OS to release memory back to the system. Thanks, | ||
| Comment by Stephen Paul Adithela [ 19/Aug/20 ] | ||
|
Hi Dmitry, unfortunately we only have 7 days' worth of log retention available, so we can't provide mongos logs from that time. But the issue also seems to be happening now. The logs also contain sensitive data because of the queries; is there a specific thread/task you are interested in? I can exclude the COMMAND lines if that is okay.
Yes, the resident memory of the mongos process shouldn't keep growing. We are wondering why mongos keeps reserving that resident memory and not releasing it.
Thanks, Stephen | ||
| Comment by Dmitry Agranat [ 19/Aug/20 ] | ||
|
Hi stephenpaul2727@gmail.com, we would also need the requested mongoS server logs covering the time when heapProfiling was enabled. Please upload them to the same directory. Also, from a quick look at the diagnostic.data, I see that the resident memory of the mongod process grows by about 50 MB per day. Over the couple of days we have data for, this translates into 1450 MB at the start and 1550 MB after ~2 days. Do these numbers make sense to you, and is this what the issue is? | ||
| Comment by Stephen Paul Adithela [ 15/Aug/20 ] | ||
|
Hi Jon, I have uploaded the mongos diag tgz file. Thanks, Stephen | ||
| Comment by Jonathan Streets (Inactive) [ 13/Aug/20 ] | ||
|
I've created a secure upload portal for you. Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time. Jon | ||
| Comment by Stephen Paul Adithela [ 11/Aug/20 ] | ||
|
Hi Jon, is there a link to MongoDB's private upload location (for the FTDC data)? Could you let me know? Thanks, Stephen | ||
| Comment by Stephen Paul Adithela [ 03/Aug/20 ] | ||
|
Thanks Jon for the info. I will attach the diagnostic data once we have captured the memory increase on mongos. | ||
| Comment by Jonathan Streets (Inactive) [ 31/Jul/20 ] | ||
|
Hi stephenpaul2727@gmail.com, after the problem reoccurs with heap profiling enabled, you can disable it and upload the $dbpath/diagnostic.data directory and the mongos server logs to this ticket. The data in that directory is known as the FTDC data and is described here. Regards, | ||