[SERVER-31141] Poor performance with 95% cache and many writers Created: 18/Sep/17 Updated: 29/Jan/18 Resolved: 08/Jan/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Bob Potter | Assignee: | Alexander Gorrod |
| Resolution: | Done | Votes: | 2 |
| Labels: | MAREF | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Participants: |
| Description |
|
On MongoDB 3.4.9 we are seeing poor performance when the cache hits 95% utilization. I've attached a snippet of mongostat output from a time when this was happening. When this slowdown happens, there is almost no disk activity. Some additional information about our environment:
Are there any configuration settings we should try changing to work around this? I can upload the diagnostic.data to the private portal if that would help. |
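For context, the cache fill level described above can be read directly from serverStatus() in the MongoDB shell. A minimal sketch, using the WiredTiger cache statistics that serverStatus exposes (the 95% figure corresponds to bytes currently in the cache relative to the configured maximum):

```
// Minimal sketch: report WiredTiger cache fill and dirty ratios from the
// mongo shell, using the statistics under serverStatus().wiredTiger.cache.
var cache = db.serverStatus().wiredTiger.cache;
var used  = cache["bytes currently in the cache"];
var max   = cache["maximum bytes configured"];
var dirty = cache["tracked dirty bytes in the cache"];
print("cache fill:  " + (100 * used / max).toFixed(1) + "%");
print("cache dirty: " + (100 * dirty / max).toFixed(1) + "%");
```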
| Comments |
| Comment by Bob Potter [ 03/Jan/18 ] | ||
|
Great! We are already on 3.6.1 and that issue has gone away. We are now seeing a different performance issue that triggers when the cache isn't maxed out. I'll open a new ticket for that once I can reproduce it again and collect diagnostic information. | ||
| Comment by Alexander Gorrod [ 03/Jan/18 ] | ||
|
bpot Sorry for the long delay in getting back to you, and thanks for uploading the data. I've taken a look at the data, and the symptoms you see correspond to some that were fixed in a later release. In particular, your workload has a collection that uses the majority of the cache, which is the scenario that fix improves. | ||
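As a rough way to confirm which collection dominates the cache, a minimal sketch (the collection name "events" is a placeholder, not from the ticket) using collStats, which reports the same WiredTiger cache statistics at collection granularity:

```
// Sketch: per-collection WiredTiger cache residency. The collection name
// "events" is a placeholder; collStats exposes WiredTiger cache statistics
// for a single collection.
var stats = db.runCommand({ collStats: "events" });
var inCache = stats.wiredTiger.cache["bytes currently in the cache"];
print("events: " + (inCache / 1024 / 1024).toFixed(1) + " MiB currently in cache");
```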
| Comment by Bob Potter [ 26/Sep/17 ] | ||
|
I've uploaded logs for about a 10-minute period where we were experiencing the issue. Let me know if there is any additional information that would be helpful. Thanks for looking into this. | ||
| Comment by Alexander Gorrod [ 26/Sep/17 ] | ||
|
bpot Thanks for uploading the logs - we have investigated the data. We haven't seen similar behavior in other workloads, so it would help us understand the behavior on your system if you could share additional diagnostic information while the system is in the described state. It's possible that the additional information gathering will have some effect on throughput, so please monitor carefully when enabling it. How you enable the logging depends on how you administer your database; the instructions below work through the MongoDB shell. Enable the additional logging using:
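The exact command was not preserved here. As an illustrative assumption only, runtime WiredTiger diagnostics of this kind are commonly toggled through the wiredTigerEngineRuntimeConfig server parameter, along these lines (the verbose categories shown are placeholders):

```
// Illustrative assumption only: the parameter actually requested in this
// comment was not preserved. This shows the general shape of enabling extra
// WiredTiger diagnostics at runtime; the verbose categories are placeholders.
db.adminCommand({
  setParameter: 1,
  wiredTigerEngineRuntimeConfig: "verbose=[evictserver]"
});
```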
Allow the machine to run for approximately 10 minutes with this setting enabled. Disable the setting by running:
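Under the same assumption, the corresponding command to turn the extra logging back off would look like:

```
// Illustrative assumption only, matching the hypothetical enable command
// above: clear the extra verbose categories again.
db.adminCommand({
  setParameter: 1,
  wiredTigerEngineRuntimeConfig: "verbose=[]"
});
```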
Then please upload the following for the time period corresponding to when the additional logging is enabled:
The additional information enabled here doesn't track application-specific content - it purely tracks abstract database operations, so there shouldn't be any privacy issues. | ||
| Comment by Bob Potter [ 19/Sep/17 ] | ||
|
Hi, I don't have the logs from the original time period, but I've uploaded redacted logs and diagnostic data for a more recent occurrence. Thanks, | ||
| Comment by Kelsey Schubert [ 18/Sep/17 ] | ||
|
Hi bpot, Thanks for providing the diagnostic.data. We're still examining it to better understand what is going on, but an initial review suggests that it would be helpful if you could provide the complete log files covering the same time period so we can correlate these events. If there is sensitive material in the logs, would you please redact it using the same procedure as in
Kind regards, | ||
| Comment by Bob Potter [ 18/Sep/17 ] | ||
|
Thanks. I've uploaded the data. | ||
| Comment by Kelsey Schubert [ 18/Sep/17 ] | ||
|
Hi bpot, I've created a secure upload portal for you to provide an archive of the diagnostic.data directory. Thank you, |