[SERVER-12401] Improve the memory-mapped files flush on Windows Created: 17/Jan/14 Updated: 01/May/18 Resolved: 01/May/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | MMAPv1, Performance, Storage |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Alexander Komyagin | Assignee: | DO NOT USE - Backlog - Platform Team |
| Resolution: | Won't Fix | Votes: | 3 |
| Labels: | None |
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Windows, high-latency network storage |
| Issue Links: | |
| Sprint: | Server 2.7.5, Server 2.7.6 |
| Participants: | |
| Description |
|
On Windows, memory-mapped file flushes are synchronous operations. When the OS Virtual Memory Manager is asked to flush a memory-mapped file, it makes a synchronous write request to the OS file cache manager. This causes large I/O stalls on Windows systems with high disk I/O latency, whereas on Linux the same writes are asynchronous. The problem becomes critical on high-latency storage such as Azure persistent storage (~10 ms per request), where it results in very long background flush times and effectively caps disk IOPS at about 100. On low-latency storage (local disks and AWS) the problem is far less visible.

In the code, after FlushViewOfFile returns we call FlushFileBuffers to ensure that the drive has actually written the changes to disk. In Windows Performance Monitor it is easy to see that the disk queue never goes above 1 during the flush.

There are several possible improvements we could make on Windows: Option #2 delivers the best performance but requires Microsoft to deliver an OS patch that fixes the behavior of memory-mapped flushes on Windows. While the performance of a preliminary patch looks promising, the ETA for a publicly released patch is approximately Q3 of calendar year 2014.

Our proposal:
|
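The description refers to the FlushViewOfFile / FlushFileBuffers sequence. The following is a minimal standalone sketch of that Win32 call pattern, not the MongoDB source: the file name, mapping size, and error handling are illustrative only. It shows why the flush is serialized on the calling thread, which is what keeps the disk queue depth at 1.

```cpp
// Minimal sketch (not MongoDB code) of the synchronous memory-mapped flush
// sequence described above. File name and mapping size are hypothetical.
#include <windows.h>

int main() {
    const char* path = "example.dat";      // illustrative data file
    const SIZE_T len = 64 * 1024 * 1024;   // 64 MB mapping, for illustration

    HANDLE file = CreateFileA(path, GENERIC_READ | GENERIC_WRITE, 0, nullptr,
                              OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return 1;

    HANDLE mapping = CreateFileMappingA(file, nullptr, PAGE_READWRITE,
                                        0, static_cast<DWORD>(len), nullptr);
    if (!mapping) return 1;

    char* view = static_cast<char*>(
        MapViewOfFile(mapping, FILE_MAP_WRITE, 0, 0, len));
    if (!view) return 1;

    view[0] = 1;  // dirty at least one page

    // Synchronous flush: the calling thread blocks until the cache manager
    // has accepted the dirty pages ...
    FlushViewOfFile(view, len);
    // ... and blocks again until the drive acknowledges the writes. With
    // ~10 ms latency per request and a disk queue depth of 1, this caps
    // throughput at roughly 100 IOPS, as noted in the description.
    FlushFileBuffers(file);

    UnmapViewOfFile(view);
    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}
```

Because both calls block, the background flush cannot keep more than one request outstanding, so total flush time grows linearly with per-request storage latency.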
| Comments |
| Comment by Matt Lord (Inactive) [ 01/May/18 ] |
|
This only affects the MMAPv1 storage engine, which has been deprecated in MongoDB 3.7. |
| Comment by Mark Benvenuto [ 02/Sep/15 ] |
|
This is not relevant to the WiredTiger storage engine since WiredTiger does not use memory-mapped files. WiredTiger manages reads and writes to its page cache explicitly and does not rely on the OS memory management layer to write changes to disk the way MMAPv1 does. |
| Comment by Christof Rudolph [ 02/Sep/15 ] |
|
Is it possible that this issue is also relevant to WiredTiger setups? |