[SERVER-23798] Increased ns file IO in 3.0  Created: 19/Apr/16  Updated: 06/Dec/22  Resolved: 14/Sep/18

| Status: | Closed |
| Project: | Core Server |
| Component/s: | MMAPv1 |
| Affects Version/s: | 3.0.9 |
| Fix Version/s: | None |
| Type: | Bug |
| Priority: | Major - P3 |
| Reporter: | Greg Murphy |
| Assignee: | Backlog - Storage Execution Team |
| Resolution: | Won't Fix |
| Votes: | 0 |
| Labels: | mmapv1 |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Assigned Teams: | Storage Execution |
| Operating System: | ALL |
| Steps To Reproduce: | Create a MongoDB 2.6 instance using MMAPv1 with enough databases that the cumulative size of their ns files is greater than the available physical memory on the server. Monitor the filesystem cache usage and disk IO on the server (see the monitoring sketches under Description). Upgrade this server to MongoDB 3.0 (still using MMAPv1) and monitor the same metrics. |
| Description |
|
Following upgrades from 2.6.9 to 3.0.9 (still using MMAPv1) we noticed significantly higher disk IO against the volume hosting MongoDB's data files. This is particularly apparent on replica sets with large numbers of databases (multiple thousands). From investigation, this appears to be caused by a change in MongoDB's behaviour when reading ns files.

To give a precise example, we have a replica set that is currently in the process of being upgraded. It has 3 x 2.6.9 nodes and 1 x 3.0.9 node (hidden, non-voting). The replica set has 5570 databases and uses the default 16MB ns size, so loading every ns file into memory would require roughly 87GB (5570 x 16MB = 89,120MB). The existing 2.6.9 nodes run comfortably as EC2 r3.larges (14GB RAM), and running vmtouch shows that only a tiny percentage of the pages of the ns files are loaded into the filesystem cache.
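A residency check of this kind can be reproduced by running vmtouch directly against the ns files; a minimal sketch, with /data/db standing in for the actual dbpath:

```sh
# Report how many pages of each ns file are resident in the filesystem
# cache; /data/db is an assumed, illustrative dbpath.
vmtouch -v /data/db/*.ns
```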
However, running the 3.0.9 node as an r3.large makes it unusable, as the filesystem cache is constantly flooded with the ns files (and the server takes 1hr 26min to start).
The server then constantly performs significant amounts of read IO, presumably in an attempt to keep the entire contents of the ns files in memory.
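That read load is visible with standard disk monitoring; a minimal sketch, with the device name being an assumption:

```sh
# Extended per-device statistics every 5 seconds; a persistently high
# rkB/s on the data volume reflects the ns files being re-read.
# xvdf is an illustrative EBS device name.
iostat -x 5 xvdf
```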
Changing the instance type to an r3.4xlarge (122GB) alleviates the problem, as there is now enough memory for all of the ns files to remain loaded (and the server starts in 35 minutes, with the IO subsystem as the limiting factor).
This isn't a feasible option for us though: an r3.4xlarge instance costs $1,102 for a 31-day month compared to $137 for an r3.large, and across a 3-node replica set that is roughly an extra $2,900 per month. |
| Comments |
| Comment by Abhishek Amberkar [ 22/Nov/17 ] |
|
Thank you Kelsey. Setting a smaller --nssize fixed the issue for us. |
| Comment by Kelsey Schubert [ 19/May/17 ] |
|
Hi abhishek.amberkar and gregmurphy, The 'Backlog' fixVersion indicates that this issue is not currently scheduled for an upcoming release. We understand the impact of this behavior on your deployments, and have discussed this issue internally. Depending on your schema design, you may not need the default 16MB namespace file and would benefit from calculating a smaller --nssize to mitigate the impact of this issue. Kind regards, Kelsey |
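For illustration only (not from this ticket): MMAPv1 keeps one entry per collection and index in system.namespaces, and the MMAPv1 documentation notes that the default 16MB ns file supports roughly 24,000 namespaces, so databases with few collections can use a much smaller file. A hedged sketch, with the database name and dbpath assumed:

```sh
# Count the namespaces (collections + indexes) in one database; MMAPv1
# lists them in the system.namespaces collection. "mydb" is an
# illustrative database name.
mongo mydb --quiet --eval 'print(db.system.namespaces.count())'

# Hypothetical restart with a 2 MB ns file instead of the 16 MB default.
# --nssize only applies to newly created databases; existing ns files
# must be rebuilt (e.g. via an initial sync or repair) to shrink.
mongod --storageEngine mmapv1 --nssize 2 --dbpath /data/db
```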
| Comment by Greg Murphy [ 02/May/17 ] |
|
I'm afraid I'm just the user who reported it, so I don't have any insight into when it will be resolved. |
| Comment by Abhishek Amberkar [ 02/May/17 ] |
|
@Thomas, @Greg Is there any positive development on this issue? |
| Comment by Abhishek Amberkar [ 06/Mar/17 ] |
|
Hi Thomas, Is this issue still being worked on? |
| Comment by Kelsey Schubert [ 03/Jan/17 ] |
|
Hi abhishek.amberkar and raghusiddarth, Unfortunately, we were not able to complete this work in time for it to be included in MongoDB 3.4. We're currently in the planning phase for our next major release and will update this ticket's fixVersion as part of that process. Kind regards, Kelsey |
| Comment by Abhishek Amberkar [ 15/Dec/16 ] |
|
Hi Thomas, Is there any update on this issue? |
| Comment by Raghu Udiyar [ 09/Nov/16 ] |
|
Hi Thomas, Can you let us know the status on this? I see that MongoDB 3.4 has been released; does that release address this issue? |
| Comment by Kelsey Schubert [ 02/May/16 ] |
|
Hi gregmurphy, Sorry for the silence. I have reproduced this issue and observed that, with a large number of databases on MMAPv1, MongoDB starts up more slowly in 3.0 and 3.2 than in 2.6. Please continue to watch this ticket for updates. Kind regards, Kelsey |
| Comment by Greg Murphy [ 02/May/16 ] |
|
Hopefully this ticket being put in the 3.3 backlog means the issue has been reproduced. To reiterate (and hopefully increase its priority): taken together with the issue I've raised in a separate ticket, I believe this to be a significant area of concern regarding MongoDB's scalability. Of course, if MongoDB isn't designed to support this kind of workload, there should be documentation to make users aware that there is a limit to the number of collections/indexes that can be created when using WiredTiger, and that there is a significant memory impact when running a large number of databases when using MMAPv1. |