[SERVER-43632] Possible memory leak in 4.0 Created: 25/Sep/19 Updated: 15/Nov/21 Resolved: 12/Dec/19
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Networking |
| Affects Version/s: | 4.0.11 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Alexander Ivanes | Assignee: | Benjamin Caimano (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Linux 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC 2019 x86_64 GNU/Linux; db version v4.0.11; Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz |
| Attachments: |
| Issue Links: |
| Operating System: | ALL |
| Sprint: | Repl 2019-10-21, Service Arch 2019-11-18, Service Arch 2019-12-02, Service Arch 2019-12-16 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
Good day. After upgrading to 4.0 we have encountered what looks like a memory leak in two cases: 1) in the cluster environment described in the linked case, and 2) recently in a replica set configuration (I attached MongoDB memory graphs from the two servers, each of which was primary at some point, along with server logs, diagnostic.data, and the configuration file). The spike around 2-3 pm 19.09 on db1-1 was the restoration of the 'drive' database with mongorestore. I also ran rs.stepDown() and restarted the server instance db1-1 at 4:30 pm 21.09 due to memory pressure (the memory leak then moved to the new primary). At 11:25 pm 21.09 we disabled most of the processes that work with that replica set. After I started the workload again I could not reproduce the memory leak; on the contrary, you can see resident memory decreasing on db1-2 from 3 pm 23.09 up to now. |
| Comments |
| Comment by Benjamin Caimano (Inactive) [ 09/Dec/19 ] | |
|
vergil, I was able to reproduce the issue by repeatedly restarting a secondary inside one of our test harnesses. I believe that this memory accumulation bug is fixed with | |
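For context, a minimal sketch of that style of reproduction using the shell's ReplSetTest test helper (node and iteration counts are arbitrary; this is not the exact internal harness referenced above):

```javascript
// Hypothetical repro sketch: repeatedly restart a secondary and watch the
// primary's resident memory climb between restarts.
const rst = new ReplSetTest({nodes: 3});
rst.startSet();
rst.initiate();

for (let i = 0; i < 50; i++) {
    rst.restart(1);              // bounce the secondary at index 1
    rst.awaitSecondaryNodes();   // wait for it to rejoin as SECONDARY
}

rst.stopSet();
```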
| Comment by Alexander Ivanes [ 12/Nov/19 ] | |
|
ben.caimano, thank you for the update. We look forward to this. | |
| Comment by Benjamin Caimano (Inactive) [ 11/Nov/19 ] | |
|
vergil, we think that perhaps you've run into a memory accumulation bug in the networking layer that we introduced in v4.0 and fixed in v4.2. While the v4.2 commit itself is not exactly something we can backport, we believe that we can chunk out the necessary pieces and reapply them to v4.0. I'm tracking this effort with | |
| Comment by Siyuan Zhou [ 17/Oct/19 ] | |
|
matthew.russotto, I thought when restarting heartbeats, we always cancel existing ones since we always track them. Could you please post a link to the code where it behaves unexpectedly? | |
| Comment by Matthew Russotto [ 17/Oct/19 ] | |
|
It appears we're accumulating heartbeat callback lambdas until we reconnect. | |
| Comment by Alexander Ivanes [ 30/Sep/19 ] | |
|
Thanks for the explanation.
Regarding scenario #1: it looks like I disabled slaveDelay because of the linked issue, so it's just a fourth (hidden) member of the replica set.
| |
| Comment by Danny Hatcher (Inactive) [ 30/Sep/19 ] | |
|
Thanks! Your scenario #2 is a known issue in MongoDB. As each query adds a field, the query shape changes, so we have to go through the query planning stage again to determine which index to use, and we store that information in the query plan cache. As you've noticed, that plan cache can take up significant system memory if you're constantly issuing larger and larger queries. It is not normally a problem except in cases like this one, where an application bug kept generating ever-larger queries. We have several tickets in our backlog to address this, most notably the one linked here.
Regarding scenario #1, can you please confirm how many nodes you expect to see in the replica set and which ones are delayed? From the data I have, there are four nodes configured but none of them have a delay set. | |
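As a hypothetical mongo shell illustration of the mechanism described above (collection and field names are made up), each query below has a different shape because it filters on one more field, and the per-collection plan cache can be inspected or cleared with the 4.0-era shell helpers:

```javascript
// Hypothetical illustration: each find() predicates on one more field, so each
// query has a new shape and is planned and cached separately.
db.items.find({ a: 1 });
db.items.find({ a: 1, b: 1 });
db.items.find({ a: 1, b: 1, c: 1 });

// Inspect the query shapes cached for the collection:
db.items.getPlanCache().listQueryShapes();
// Drop the cached plans for the collection if the cache has grown too large:
db.items.getPlanCache().clear();
```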
| Comment by Alexander Ivanes [ 30/Sep/19 ] | |
|
Yes, sorry. Case #1: Dorado1 is the machine name, driveFS-18 is the mongod instance name. Log times for driveFS-18 are in UTC+0, graphs are in UTC+3. Case #2: logs and graphs are in UTC+3, all files named drive1-db1*.
Again, sorry for the difference in timezones; for driveFS-18 I had to convert the logs from our logging system. (For drive1-db1 I just took the log file as you recommended earlier.)
Update: all time marks in my previous message were in UTC+3. | |
| Comment by Danny Hatcher (Inactive) [ 30/Sep/19 ] | |
|
vergil, can you please clarify which file names apply to which scenario? Also, are all the times you mention in UTC? I want to make sure we're on the same page. | |
| Comment by Alexander Ivanes [ 29/Sep/19 ] | |
|
I've successfully reproduced both cases.
1. After shutting down all slave-delayed shards (around 2-3 pm 27.09), memory usage goes up. I've attached diagnostic.data, logs, and a memory chart (for both the server and the mongod instance; this behavior occurred on all 23 mongod instances on this machine). The heap profiler was enabled on that instance. I brought the slave-delayed instance back around 3 am 29.09.
2. This one was a little bit tricky. It turns out there was a bug in our code that monotonically increased the size of the search query. We fixed it some time ago, so I asked for the bug to be reintroduced so we could try to reproduce this behavior: each consecutive query adds another part, and memory usage grows with each query. After some time the query can be relatively large (about 90 KB). I don't know whether this is intended behavior or not, so please look at the attached logs. We published the 'bugged' version at 5:30 pm 27.09 and unpublished it at 10 am the next day. | |
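Purely for illustration, the kind of application-side pattern described in point 2 might look like the hypothetical sketch below (field names and the growing clause are assumptions):

```javascript
// Hypothetical sketch of a query that grows with every call: each iteration
// adds one more predicate, so every query has a new, larger shape.
let query = {};
for (let i = 0; i < 1000; i++) {
    query["field" + i] = { $exists: true };      // one more clause each call
    db.items.find(query).limit(1).toArray();     // new, larger query shape every time
}
```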
| Comment by Danny Hatcher (Inactive) [ 26/Sep/19 ] | |
|
Thanks for the clarifications. I believe we'll have to wait for the heap profiler output to obtain any more useful information so I will put this ticket into Waiting status until then. | |
| Comment by Alexander Ivanes [ 26/Sep/19 ] | |
|
Thanks. I misread earlier and have uploaded the file to your portal too, sorry.
| |
| Comment by Danny Hatcher (Inactive) [ 26/Sep/19 ] | |
|
Apologies, I meant to set up an upload portal for you to use. You can upload any future files relevant to this case via this link. Only MongoDB engineers will be able to access them and they will automatically be deleted after 90 days. I've grabbed the log file link you just sent and uploaded it to our portal so feel free to remove it from your S3. | |
| Comment by Alexander Ivanes [ 26/Sep/19 ] | |
|
Quick update for the first case: the memory leak happened on all shards in the sharded cluster, primary and secondary. I've attached a memory graph from one of the shards and shared a link to the log file (notify me when you no longer need it so I can remove the file from S3 storage). Note that times in the log are UTC+0, but the graph shows UTC+3. We shut down the servers around 6:30 am 18.08 (9:30 am on the graph). | |
| Comment by Alexander Ivanes [ 25/Sep/19 ] | |
|
Daniel, thanks for the info.
To clarify: in that case the memory leak appeared after the 4th server in the replica set became unreachable (at first I thought it was a leak in the logging processor in mongod, because of the significantly increased number of messages). I will check tomorrow whether it happened only on the primary or on the secondaries too.
Fortunately this is a test instance of the DB, so I can easily enable the heap profiling setting. I will also try to recreate the workload as closely as possible. | |
| Comment by Danny Hatcher (Inactive) [ 25/Sep/19 ] | |
|
For the first case you mentioned, we haven't seen evidence of a memory leak there. For the second case, I do see the long rise in resident memory on "rainbow1-16". Unfortunately, there's not enough information in the data we have now as to what caused it. I recommend that you enable the following on the nodes in the environment that you think is most likely to experience the rise in memory again:
I am not sure how your logging facility will parse the information, so it would be very useful if you could write to a file instead. Additionally, there is a performance hit to enabling this feature, so I recommend only leaving it on while we are troubleshooting. If you do see a rise in memory on a node with this setting enabled, please provide the full "diagnostic.data" and mongod logs for the relevant node.
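A minimal sketch of enabling this kind of heap profiling, assuming the recommendation was the tcmalloc heapProfilingEnabled server parameter (later comments refer to enabling the "heap profiler"):

```javascript
// Assumption: heapProfilingEnabled is a startup-only parameter, so mongod would
// be started with something like:
//   mongod --setParameter heapProfilingEnabled=true --logpath /var/log/mongod.log ...
// From the shell, confirm the parameter took effect:
db.adminCommand({ getParameter: 1, heapProfilingEnabled: 1 });
// While reproducing the memory rise, the profiler's accounting should show up
// in serverStatus (the heapProfile section is only present when profiling is on):
db.serverStatus().heapProfile;
```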