[SERVER-6789] Improve performance of flushing data files when using lots of databases/data files Created: 17/Aug/12 Updated: 10/Dec/14 Resolved: 22/Nov/13
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance |
| Affects Version/s: | 2.0.7, 2.2.0-rc1 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 2 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Participants: | |
| Description |
| The _flushAll method in mmap.cpp does an O(n^2) iteration over all the data files to sync them to disk (it only calls flush once per file, however, which is good). It also acquires and releases the files lock between flushing each file. When running with a very large number of data files (usually from having a large number of databases), this can make the flush take a long time even when there isn't much data to flush. |
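The following is a minimal C++ sketch of the pattern described above, not the actual mmap.cpp code; the names (MongoFile, allFiles, filesLock, flushAllSketch) are hypothetical stand-ins. It illustrates why dropping the files lock around each flush forces a fresh scan of the file set for every file, giving O(n^2) work for n data files, with an extra O(log n) factor per membership test from the tree set.

```cpp
#include <mutex>
#include <set>

// Hypothetical stand-ins for the real types in mmap.cpp.
struct MongoFile {
    void flush(bool /*sync*/) { /* fsync of the mapped region elided in this sketch */ }
};

std::mutex filesLock;            // protects the global set of mapped files
std::set<MongoFile*> allFiles;   // tree set, so membership tests cost O(log n)

// Shape of the pattern described in the ticket: the lock is dropped around each
// flush (so other threads can register/unregister files), which forces a fresh
// O(n) scan of the set before each flush; flushing n files is therefore O(n^2).
void flushAllSketch(bool sync) {
    std::set<MongoFile*> flushed;                 // files already flushed
    while (true) {
        MongoFile* next = nullptr;
        {
            std::lock_guard<std::mutex> lk(filesLock);
            for (MongoFile* f : allFiles) {       // O(n) scan per iteration
                if (flushed.count(f) == 0) {      // O(log n) lookup in the tree set
                    next = f;
                    break;
                }
            }
        }                                         // lock released before flushing
        if (!next)
            break;
        next->flush(sync);                        // flush is called once per file
        flushed.insert(next);
    }
}
```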
| Comments |
| Comment by Daniel Pasette (Inactive) [ 22/Nov/13 ] |
| Comment by Bruce Lucas (Inactive) [ 19/Nov/13 ] |
| In addition, it uses a tree set, which adds an extra factor of log n; just changing that to a hash set (unordered_set), without touching the n^2 loops, reduces the time by a factor of 4 for 5000 databases. |
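A sketch of the change Bruce Lucas describes, again with hypothetical names rather than the real mmap.cpp code: only the bookkeeping container changes from std::set to std::unordered_set, so each membership test drops from O(log n) to amortized O(1), while the O(n^2) scanning is left untouched. That is consistent with the reported constant-factor speedup (about 4x at 5000 databases) rather than a change in asymptotic behavior.

```cpp
#include <mutex>
#include <set>
#include <unordered_set>

// Same hypothetical types as in the sketch above.
struct MongoFile {
    void flush(bool /*sync*/) { /* fsync elided in this sketch */ }
};

std::mutex filesLock;
std::set<MongoFile*> allFiles;

// Same quadratic structure as before; only the "already flushed" bookkeeping
// is switched to a hash set, removing the log n factor from each lookup.
void flushAllHashSetSketch(bool sync) {
    std::unordered_set<MongoFile*> flushed;       // amortized O(1) lookups
    while (true) {
        MongoFile* next = nullptr;
        {
            std::lock_guard<std::mutex> lk(filesLock);
            for (MongoFile* f : allFiles) {       // the O(n^2) scan is unchanged
                if (flushed.count(f) == 0) {
                    next = f;
                    break;
                }
            }
        }
        if (!next)
            break;
        next->flush(sync);
        flushed.insert(next);
    }
}
```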
| Comment by Christopher Price [ 07/Apr/13 ] |
| Does this potentially block or slow down reads? If not, what is the negative impact on a system when this runs for a very long time? |