[SERVER-24631] TTL Monitor performance degradation on MongoDB 3.0 Created: 17/Jun/16 Updated: 30/Jan/17 Resolved: 19/Jul/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | MMAPv1 |
| Affects Version/s: | 3.0.12 |
| Fix Version/s: | 3.3.11 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Gregory Banks | Assignee: | Kevin Albertson |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
AWS t2.small |
||
| Attachments: |
|
||||||||||||||||
| Issue Links: |
|
||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Steps To Reproduce: | To reproduce this behavior, spin up a t2.small instance on AWS, attach a 33 GB gp2 volume, deploy MongoDB 3.0.12, ensure that dbpath points to a directory that lives on the attached volume, and, finally, restore the attached dump. With zero user activity, you should see read behavior on that volume that resembles the attached graphs and vmstat dump. |
||||||||||||||||
| Sprint: | Integration 17 (07/15/16), Integration 18 (08/05/16) | ||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
Hi, I’ve noticed a severe spike in resource utilization leading to degradation in performance on AWS since upgrading to 3.0 as a result of the way the TTLMonitor queries for indexes when using MMAPv1. In 2.6, TTL indexes were collected by querying the system.indexes collection like so:
In 3.0 system.indexes was deprecated and a new abstraction layer between the database and the storage engine was introduced. As a result, namespace file operations are now much less efficient. The issue I am seeing appears to be the result of the following bit of code:
which gets executed by the TTLMonitor via this code path:
As a result, the entire namespace file for every database gets pulled into memory every time the TTLMonitor executes (every 60 seconds by default). Of course, the default namespace file size is only 16 MB, so this really isn’t an issue in the most common case. However, if you want to set up a development environment for a number of users on a single host, you will find yourself scratching your head as to why performance is so bad. It should be noted that performance will be bad regardless of user activity, database size, or the presence of TTL indexes (all of which would only serve to exacerbate the situation).

In addition to the development use case, it is possible to run into similar issues with a single database that has many collections/indexes and requires a namespace file larger than the default (up to 2048 MB). In this case, both the TTLMonitor and any command that involves a namespace file scan (e.g., listCollections) will cause issues.

To reproduce this behavior, spin up a t2.small instance on AWS, attach a 33 GB gp2 volume, deploy MongoDB 3.0.12, ensure that dbpath points to a directory that lives on the attached volume, and, finally, restore the attached dump. With zero user activity, you should see read behavior on that volume that resembles the attached graphs and vmstat dump.

At the very least, I think this should be explicitly documented in order to save time/confusion on the part of developers/operations and to help with capacity planning and architecture decisions going forward.

Greg |
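To put rough numbers on the scan cost described above, here is a minimal Python sketch that estimates how much namespace-file data a single TTLMonitor pass would touch. It assumes the worst case, in which no namespace pages stay resident between passes, so every pass re-reads every file from disk. The function name, the figure of 200 databases, and that worst-case assumption are illustrative and do not come from the ticket. The 16 MB default, the 2048 MB maximum, and the 60-second interval are taken from the description.

```python
def ns_read_estimate(num_databases, ns_file_mb=16, interval_s=60):
    """Estimate namespace-file reads caused by periodic TTLMonitor passes.

    Assumption (hypothetical worst case): every pass re-reads each
    database's namespace file in full because nothing stays cached.

    Returns a tuple (MB read per pass, average MB/s over the interval).
    """
    per_pass_mb = num_databases * ns_file_mb
    return per_pass_mb, per_pass_mb / interval_s


# Hypothetical shared development host: 200 databases, default 16 MB files.
per_pass, avg_rate = ns_read_estimate(200)
print(f"{per_pass} MB per pass, about {avg_rate:.1f} MB/s sustained")

# Single database with the maximum 2048 MB namespace file.
per_pass, avg_rate = ns_read_estimate(1, ns_file_mb=2048)
print(f"{per_pass} MB per pass, about {avg_rate:.1f} MB/s sustained")
```

Under these assumptions, both cases work out to tens of MB/s of background reads with no user activity at all. On a small instance with limited memory and a burstable volume, that is enough to matter for capacity planning. Actual numbers depend on how much of each file the operating system keeps cached.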
| Comments |
| Comment by Githook User [ 19/Jul/16 ] |
|
Author: Kevin Albertson (kevinAlbs) <kevin.albertson@10gen.com> Message: |
| Comment by Gregory Banks [ 23/Jun/16 ] |
|
Hey Thomas, Awesome and no problem Cheers, |
| Comment by Kelsey Schubert [ 21/Jun/16 ] |
|
Hi gregbanks, I've confirmed that this issue affects MongoDB 3.2.7 as well, and I'm marking this ticket to be scheduled. Please continue to watch for updates. Thank you again for the detailed steps to reproduce! |
| Comment by Gregory Banks [ 17/Jun/16 ] |
|
No problem. Please let me know if you need anything else! |
| Comment by Ramon Fernandez Marina [ 17/Jun/16 ] |
|
Thanks for the detailed bug report gregbanks, we're investigating. |