[SERVER-6789] Improve performance of flushing data files when using lots of databases/data files Created: 17/Aug/12  Updated: 10/Dec/14  Resolved: 22/Nov/13

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 2.0.7, 2.2.0-rc1
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Spencer Brody (Inactive) Assignee: Unassigned
Resolution: Duplicate Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
duplicates SERVER-7973 Mongo should be able to reply to clie... Closed

 Description   

The _flushAll method in mmap.cpp does an O(n^2) iteration over all the data files to sync them to disk (it does call flush only once per file, which is good). It also acquires and releases the files lock between flushing each file. When running with a very large number of data files (usually the result of having a large number of databases), the flush can take a long time even when there is little data to flush.
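A minimal sketch of the pattern being described (illustrative only, not MongoDB's actual `_flushAll` code; `DataFile`, `flushAllQuadratic`, and `flushAllLinear` are hypothetical names): the quadratic version re-takes the files lock and re-scans the whole file list once per file, while the linear version snapshots the list under a single lock acquisition and flushes outside it.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

struct DataFile {
    int id = 0;
    bool flushed = false;
};

// O(n^2) work and one lock round trip per file: for each file, the whole
// list is re-scanned under the lock to find it before flushing.
int flushAllQuadratic(std::vector<DataFile>& files, std::mutex& filesLock) {
    int scans = 0;  // count of inner-loop visits, to expose the n^2 cost
    for (size_t i = 0; i < files.size(); ++i) {
        std::lock_guard<std::mutex> lk(filesLock);  // lock taken per file
        for (auto& f : files) {                     // full scan per file
            ++scans;
            if (f.id == static_cast<int>(i)) {
                f.flushed = true;                   // stand-in for a flush
                break;
            }
        }
    }
    return scans;
}

// O(n): take the lock once, snapshot the file list, then flush each file
// without holding the lock between flushes.
int flushAllLinear(std::vector<DataFile>& files, std::mutex& filesLock) {
    std::vector<DataFile*> snapshot;
    {
        std::lock_guard<std::mutex> lk(filesLock);  // single acquisition
        for (auto& f : files) snapshot.push_back(&f);
    }
    for (auto* f : snapshot) f->flushed = true;     // stand-in for a flush
    return static_cast<int>(snapshot.size());
}
```

With 100 files laid out in id order, the quadratic version visits 1 + 2 + ... + 100 = 5050 list entries and takes the lock 100 times, versus 100 visits and one lock acquisition for the linear version.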



 Comments   
Comment by Daniel Pasette (Inactive) [ 22/Nov/13 ]

SERVER-7973

Comment by Bruce Lucas (Inactive) [ 19/Nov/13 ]

In addition, it uses a tree set, which adds an extra factor of log n; just changing that to a hash set (std::unordered_set), without touching the n^2 loops, reduces the time by a factor of 4 for 5000 databases.
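The container change amounts to this (a hedged sketch; the function names and keys are illustrative, not the actual code): membership checks against a `std::set` cost O(log n) per lookup because the red-black tree must be walked, while `std::unordered_set` lookups are O(1) on average, so swapping containers removes the log n factor from each of the n^2 lookups.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <unordered_set>

// Tree-based set: each lookup walks the tree, O(log n) comparisons.
bool isFlushed(const std::set<std::string>& tree, const std::string& name) {
    return tree.count(name) != 0;
}

// Hash-based set: each lookup is O(1) on average.
bool isFlushedFast(const std::unordered_set<std::string>& hash,
                   const std::string& name) {
    return hash.count(name) != 0;
}
```

Both containers give the same answers; only the per-lookup cost differs, which is why the change speeds up the loops without altering their behavior.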

Comment by Christopher Price [ 07/Apr/13 ]

Does this potentially block or slow down reads? If not, what is the negative impact to a system when this runs for a very long time?

Generated at Thu Feb 08 03:12:42 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.