[SERVER-40703] Resource leakage + slowness after creating and deleting few thousands of DBs Created: 17/Apr/19 Updated: 06/Dec/22 Resolved: 26/Jun/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 3.6.11 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | assaf-oren | Assignee: | Backlog - Storage Engines Team |
| Resolution: | Done | Votes: | 1 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: |
Windows & Linux |
| Attachments: | |
| Issue Links: | |
| Assigned Teams: |
Storage Engines |
| Participants: | |
| Story Points: | 5 |
| Description |
|
Creating a new DB with collections and indexes and then deleting it becomes slow after a few thousand cycles. The mongo process's memory keeps growing even though the DBs are deleted (only a single DB is present at any given time). At some point the cycles become slower and slower; this seems to happen around ~3.5GB of memory consumption. After stopping the script, the mongo process keeps consuming a large amount of CPU and memory indefinitely.
See a simple script to reproduce the issue: https://github.com/assaf-xm/mongodb-create-drop-dbs/blob/master/test.js
And results: https://github.com/assaf-xm/mongodb-create-drop-dbs/blob/master/3.6.11-in-memory.png https://github.com/assaf-xm/mongodb-create-drop-dbs/blob/master/3.6.11-regular.png
At the start, timings are fast (201ms): 2019-04-17T12:11:25.645Z start 3
But after 8800 cycles, DB operations are much slower (685ms): 2019-04-17T12:41:20.397Z start 8800
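The linked test.js is the authoritative repro; as a rough sketch of the cycle it performs (mongo shell against a running mongod; the DB, collection, and count values below are illustrative assumptions, not taken from the script):

```javascript
// Hedged sketch of the create/drop cycle described above (mongo shell).
// Names ("testdb", "coll0".."coll9") and counts are assumptions for illustration.
for (var cycle = 0; cycle < 10000; cycle++) {
    var start = new Date();
    var testDb = db.getSiblingDB("testdb");
    // Creating collections with an index each implicitly creates the DB.
    for (var i = 0; i < 10; i++) {
        var coll = testDb.getCollection("coll" + i);
        coll.insertOne({x: 1});
        coll.createIndex({x: 1});
    }
    // Drop the DB, so only a single DB exists at any given time.
    testDb.dropDatabase();
    print(new Date().toISOString() + " cycle " + cycle + " took " +
          (new Date() - start) + "ms");
}
```

Per the description, the per-cycle time printed by a loop like this starts around 200ms and degrades past ~8000 cycles.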
|
| Comments |
| Comment by Brian Lane [ 26/Jun/20 ] |
|
Thanks assaf@xmcyber.com - great to hear. -Brian |
| Comment by assaf-oren [ 25/Jun/20 ] |
|
Following our latest conversation with Kian Gogan, I checked this with mongo 4.2.3 and the issue doesn't seem to reproduce. Therefore I think we can close this ticket. |
| Comment by assaf-oren [ 17/Feb/20 ] |
|
I can see that this fix was delayed. Is there any command-based workaround to avoid restarting the DB when memory gets high (and performance degrades)?
|
| Comment by Bruce Lucas (Inactive) [ 25/Apr/19 ] |
|
The issue reproduces on 4.1.10, and with WT (not just in-memory). Here's a run on 4.1.10, standalone, WT, cache size 0.2 GB, with the heap profiler enabled:
When the cache reached 95%, the "bytes not belonging to page images" were finally freed (evicted?), and based on the behavior of the heap profile these seem like likely contributors:
This event, however, did not delete data handles, and the application performance impact due to "dhandle lock application thread time waiting" continued growing. |
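The statistics quoted in the comment above come from serverStatus; a sketch of how one might poll them while the repro runs (mongo shell; the exact stat section and field names are recalled from memory and may vary by WiredTiger version):

```javascript
// Sketch: poll dhandle- and cache-related WiredTiger stats via serverStatus.
// Field names below are assumptions from memory, not verified against this build.
var wt = db.serverStatus().wiredTiger;
printjson({
    activeDhandles: wt["data-handle"]["connection data handles currently active"],
    dhandleLockWaitUsecs:
        wt.lock["dhandle lock application thread time waiting (usecs)"],
    cacheBytes: wt.cache["bytes currently in the cache"],
    nonPageImageBytes:
        wt.cache["bytes not belonging to page images in the cache"]
});
```

Watching "data handles currently active" and the dhandle lock wait time across cycles would show whether dropped databases are leaving handles behind, which is the behavior described in this comment.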
| Comment by Bruce Lucas (Inactive) [ 25/Apr/19 ] |
|
Simple shell-based repro:
The following run uses the in-memory storage engine, 3.6.11, running standalone, with a cache size (--inMemorySizeGB) of 1 GB.
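The repro snippet itself did not survive the export; a minimal stand-in under the stated configuration (in-memory engine, 1 GB cache, standalone) might look like the following. Paths and loop counts are assumptions, and the inMemory engine requires an Enterprise build:

```shell
# Hedged stand-in for the stripped repro: start a standalone mongod with the
# in-memory storage engine and a 1 GB cache, then create/drop a DB in a loop.
mkdir -p /tmp/db
mongod --storageEngine inMemory --inMemorySizeGB 1 --dbpath /tmp/db \
       --fork --logpath /tmp/mongod.log
mongo --quiet --eval '
  for (var i = 0; i < 10000; i++) {
    var t = new Date();
    var d = db.getSiblingDB("testdb");      // single DB reused each cycle
    d.c.insertOne({x: 1});
    d.c.createIndex({x: 1});
    d.dropDatabase();
    if (i % 100 == 0) print(i + ": " + (new Date() - t) + "ms");
  }'
```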
|
| Comment by assaf-oren [ 20/Apr/19 ] |
|
Also uploaded the diagnostic data of the last few days to the secure upload portal. |
| Comment by assaf-oren [ 20/Apr/19 ] |
|
Hi,
I did one more run of the script, letting it run for hours (30000 cycles), using disk storage. See the file '3.6.11-30000_cycles-disk_storage-WiredTiger.png'. As you can see, the latency starts to grow after ~14000 cycles, climbing up to 1100ms, but it drops from time to time back to ~200ms.
Thanks, Assaf |
| Comment by Eric Sedor [ 19/Apr/19 ] |
|
Yes Assaf, you can use this secure upload portal. Files uploaded here are visible only to MongoDB employees and deleted after some time. |
| Comment by assaf-oren [ 19/Apr/19 ] |
|
Hi Eric,
To answer your question: the memory growth was seen with both the in-memory and disk storage engines. With the in-memory engine it is faster to reach the point at which a cycle's latency (creating and deleting a DB and collections) starts to grow. Once it starts to grow, it continues to grow, as you can see from the graphs, and DB CPU remains high even when the script is stopped. (With the in-memory engine, creation and deletion of DBs are ~7 times faster, so the issue reproduces faster.)
I will run the script again and provide you with the diagnostic.data; can you provide a secure private location for these files?
Thanks, Assaf |
| Comment by Eric Sedor [ 18/Apr/19 ] |
|
Thanks for your report assaf-oren. Can you clarify: are you reporting this unexpected memory growth using the in-memory storage engine? We are testing your script internally, but it will also help if you can archive (tar or zip) the $dbpath/diagnostic.data directory for your test cluster (described here) and attach it to this ticket. |