[SERVER-574] Detrimental performance when paging (need to reduce concurrency, use madvise and mincore) Created: 26/Jan/10  Updated: 06/Dec/22  Resolved: 14/Sep/18

Status: Closed
Project: Core Server
Component/s: MMAPv1, Performance
Affects Version/s: 1.3.1
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Roger Binns Assignee: Backlog - Storage Execution Team
Resolution: Won't Fix Votes: 38
Labels: mmapv1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 9.10, 64 bit


Issue Links:
Depends
Assigned Teams:
Storage Execution
Participants:

 Description   

If the memory-mapped pages of a file are not present in RAM, a page fault is taken to fetch the contents from disk. With concurrent requests that have little data in common, the situation gets worse: many different (random) areas take page faults, causing lots of disk accesses including random seeks, which drops throughput considerably. Because this slows completion of every executing request, it increases the chance that yet another request arrives and starts executing, making things worse still. What you end up seeing is an essentially idle CPU, an I/O subsystem at 100% capacity, and hard disks seeking their hearts out.

The consequence is that when MongoDB hits this capacity limit, performance falls off a cliff. There are several things that can be done to correct this:

  • Reduce concurrency as saturation is approached, so in-flight requests complete quickly instead of accumulating many slow, very long running requests
  • Under POSIX the madvise system call can be used. For example, if an index or data file is being read sequentially, MADV_SEQUENTIAL and MADV_WILLNEED advise the kernel to read those pages in ahead of use. MADV_DONTNEED on pages that won't be needed again in the near future helps the kernel decide which pages to evict to make space for new ones.
  • The mincore system call reports whether the pages of a memory range are resident, i.e. whether touching them will take a page fault. This is probably the best test for available concurrency (throttle how often you proceed when the range is not resident). A minimal sketch of both calls follows this list.
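
A minimal sketch of the two calls mentioned above (Linux/POSIX; it assumes the data file is already mmap'ed and that addr is page-aligned, as mincore requires). Note that madvise takes a single advice value per call, so MADV_SEQUENTIAL and MADV_WILLNEED are issued as two separate calls here:

    #include <sys/mman.h>
    #include <unistd.h>
    #include <cstddef>
    #include <vector>

    // Hint that [addr, addr + len) will soon be read sequentially.
    // madvise takes one advice value per call, so the two hints are separate calls.
    void hintSequentialRead(void* addr, size_t len) {
        madvise(addr, len, MADV_SEQUENTIAL);
        madvise(addr, len, MADV_WILLNEED);
    }

    // Return true only if every page in [addr, addr + len) is already resident,
    // i.e. touching the range should not trigger a page fault.
    // addr must be page-aligned, as mincore requires.
    bool rangeIsResident(void* addr, size_t len) {
        const size_t pageSize = static_cast<size_t>(sysconf(_SC_PAGESIZE));
        const size_t pages = (len + pageSize - 1) / pageSize;
        std::vector<unsigned char> residency(pages);
        if (mincore(addr, len, residency.data()) != 0)
            return false;                 // be conservative on error
        for (unsigned char page : residency)
            if ((page & 1) == 0)          // low bit set => page is in core
                return false;
        return true;
    }

A caller could then throttle selectively: dispatch requests whose ranges are resident immediately, and cap how many non-resident (faulting) requests run at once.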


 Comments   
Comment by Roger Binns [ 08/Oct/10 ]

@Matthias: While just watching query completion times will help, the problem with that approach is that throttling will affect all queries, including those that could have been served out of memory immediately.

For example, let's say that half of the queries are over the same range of data, which is consequently in memory, and the other half are very random. Throttling would affect both, whereas only the random ones need to be throttled.

Taking advantage of operating system calls in order to be smart about throttling is a good thing.

Comment by Matthias Götzke [ 08/Oct/10 ]

It might be possible to achieve the same warning by looking at deviations in query time over the last x seconds/minutes. That way OS-specific functions would not be needed (especially since they might be difficult to work into the access code).

e.g.

  • have x concurrent workers and watch speed over the last x queries or seconds
  • try with x-1 -> compare
  • try with x+1 -> compare
  • adjust x to whichever gives the best speed
  • do it again after x seconds or when a deviation is detected
  • limit x to stay within a min/max range

It would be a similar auto-detection mechanism to the one used for indices, automatically optimizing for the best worker thread count.
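
A rough sketch of that hill-climbing loop (names are purely illustrative, not MongoDB internals; avgLatencyAt is assumed to run the workload briefly with the given worker count and return the mean query latency observed):

    #include <initializer_list>

    struct ConcurrencyTuner {
        int minWorkers = 1;
        int maxWorkers = 16;
        int workers = 4;

        // MeasureFn runs the workload briefly with n workers and returns the
        // mean query latency; how it measures is outside this sketch.
        template <class MeasureFn>
        void retune(MeasureFn avgLatencyAt) {
            int best = workers;
            double bestLatency = avgLatencyAt(workers);

            // Probe one step down and one step up, keep whichever is fastest.
            for (int candidate : {workers - 1, workers + 1}) {
                if (candidate < minWorkers || candidate > maxWorkers)
                    continue;
                double latency = avgLatencyAt(candidate);
                if (latency < bestLatency) {
                    bestLatency = latency;
                    best = candidate;
                }
            }
            workers = best;  // call retune() again after a fixed interval or
                             // when latency deviates from its recent average
        }
    };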

Comment by Roger Binns [ 27/Jan/10 ]

In a benchmark run, going from 5 concurrent worker processes to 3 decreased the run time from 10h1m to 7h12m. The concurrency was killing performance! Conversely, with CouchDB, 5 workers took 8h1m and 3 took 10h.
