[SERVER-574] Detrimental performance when paging (need to reduce concurrency, use madvise and mincore) Created: 26/Jan/10 Updated: 06/Dec/22 Resolved: 14/Sep/18
| Status: | Closed |
| Project: | Core Server |
| Component/s: | MMAPv1, Performance |
| Affects Version/s: | 1.3.1 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Roger Binns | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Won't Fix | Votes: | 38 |
| Labels: | mmapv1 |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | Ubuntu 9.10, 64 bit |
| Assigned Teams: | Storage Execution |
| Description |
If the memory-mapped pages of a file are not present in RAM, a page fault is taken to fetch their contents from disk. With concurrency, if the requests do not have much data in common, the situation gets worse: many different (random) areas of the file take page faults at once, causing a flood of disk accesses and random seeks that drops throughput considerably. Because this slows the completion of every executing request, it also increases the chance of yet another request arriving, and if that one starts executing it makes things worse still. What you end up seeing is an essentially idle CPU, the I/O subsystem at 100% capacity, and hard disks seeking their hearts out. The consequence is that when MongoDB hits this capacity limit, performance falls off a cliff. There are several things that can be done to correct this, per the issue title: reduce concurrency while the system is paging, and use madvise and mincore so the server knows which pages are resident before touching them (see the sketch below).
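A minimal sketch of the mincore/madvise side of that idea, assuming Linux; the helper names residentFraction and prefetch are illustrative, not MongoDB code:

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <vector>

// Fraction of the pages backing [addr, addr + len) that are resident in RAM.
// addr must be page-aligned (mmap() already returns page-aligned addresses).
double residentFraction(void* addr, size_t len) {
    const long pageSize = sysconf(_SC_PAGESIZE);
    const size_t pages = (len + pageSize - 1) / pageSize;
    std::vector<unsigned char> vec(pages);
    if (mincore(addr, len, vec.data()) != 0)
        return -1.0;  // e.g. ENOMEM if part of the range is not mapped
    size_t resident = 0;
    for (unsigned char b : vec)
        resident += b & 1;  // bit 0 of each byte: the page is in core
    return static_cast<double>(resident) / pages;
}

// Ask the kernel to start paging the range in asynchronously, so a request
// about to touch it does not stall on synchronous random faults.
void prefetch(void* addr, size_t len) {
    madvise(addr, len, MADV_WILLNEED);
}
```

With helpers like these, a dispatcher could check residency before running a query and either prefetch the range or queue the query instead of letting it fault synchronously.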
| Comments |
| Comment by Roger Binns [ 08/Oct/10 ] |
@Matthias: While just watching query completion times would help, the problem is that throttling would then affect all queries, including those that could have been served out of memory immediately. For example, let's say half of the queries are over the same range of data, which is consequently in memory, and the other half are very random. Throttling would affect both, whereas only the random ones need to be throttled. Taking advantage of operating system calls in order to be smart about throttling is a good thing (see the sketch below).
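A hedged sketch of that selective policy, reusing the residentFraction helper from the description above; the threshold constant and function name are assumptions, not MongoDB settings:

```cpp
#include <cstddef>

// residentFraction() is the helper sketched in the description above.
double residentFraction(void* addr, std::size_t len);

constexpr double kResidentThreshold = 0.95;  // assumed tuning knob

// Queries whose data is mostly resident run immediately at full concurrency;
// only queries that would fault go through a throttled paging queue.
bool needsThrottling(void* addr, std::size_t len) {
    return residentFraction(addr, len) < kResidentThreshold;
}
```

This way the in-memory half of the workload bypasses the throttle entirely, and only the random, fault-heavy queries pay the queueing cost.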
| Comment by Matthias Götzke [ 08/Oct/10 ] |
It might be possible to achieve the same warning by looking at deviations of query time over the last x seconds/minutes; that way OS-specific functions would not be needed (especially since they might be difficult to work into the access code). For example: run x concurrent workers, watch speed over the last x queries or seconds, repeat after x seconds or whenever a deviation is detected, and keep x within a min/max bound. It would be a similar auto-detection mechanism to the one used for indices, optimizing the worker-thread count automatically (a sketch follows below).
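A minimal sketch of that auto-tuning loop, assuming a fixed 100-sample window and a simple back-off/probe rule; the class name and the thresholds are invented for illustration:

```cpp
#include <algorithm>
#include <atomic>
#include <chrono>
#include <mutex>
#include <vector>

// Adjusts the allowed worker count from observed query latencies alone,
// with no OS-specific calls. Window size and thresholds are illustrative.
class AdaptiveWorkerLimit {
public:
    AdaptiveWorkerLimit(int minWorkers, int maxWorkers)
        : min_(minWorkers), max_(maxWorkers), limit_(maxWorkers) {}

    // Record one query's latency; retune after every window of 100 samples.
    void record(std::chrono::milliseconds latency) {
        std::lock_guard<std::mutex> lk(mu_);
        window_.push_back(static_cast<double>(latency.count()));
        if (window_.size() < 100)
            return;
        double mean = 0;
        for (double v : window_) mean += v;
        mean /= window_.size();
        window_.clear();
        if (baseline_ == 0) {  // first full window establishes the baseline
            baseline_ = mean;
            return;
        }
        if (mean > 2.0 * baseline_)       // latency deviated upward: back off
            limit_.store(std::max(min_, limit_.load() - 1));
        else if (mean < 1.2 * baseline_)  // healthy: probe more concurrency
            limit_.store(std::min(max_, limit_.load() + 1));
    }

    // Current cap on concurrent workers; the dispatcher consults this.
    int limit() const { return limit_.load(); }

private:
    std::mutex mu_;
    std::vector<double> window_;
    double baseline_ = 0;
    const int min_, max_;
    std::atomic<int> limit_;
};
```

The dispatcher would call record() after each query and consult limit() before admitting new work, shrinking the worker pool when latencies deviate upward and probing for more concurrency when they recover.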
| Comment by Roger Binns [ 27/Jan/10 ] |
In a benchmark run, going from 5 concurrent worker processes to 3 decreased the run time from 10h1m to 7h12m; the concurrency was killing performance! Conversely, with CouchDB, 5 workers took 8h1m and 3 workers took 10h.