[SERVER-11840] groupcommitwithlimitedlocks doesn't work Created: 23/Nov/13 Updated: 03/Mar/15 Resolved: 26/Feb/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Concurrency |
| Affects Version/s: | 2.5.4 |
| Fix Version/s: | 3.0.0-rc7 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Dwight Merriman | Assignee: | Kaloian Manassiev |
| Resolution: | Done | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Participants: |
| Description |
|
groupCommitWithLimitedLocks() does not appear to actually let other operations run in parallel: there is still contention on groupCommitMutex, even though the databases themselves are not locked. I will look into it. |
| Comments |
| Comment by Kaloian Manassiev [ 26/Feb/15 ] |
|
Fixed in f346c4ad0c2d892014c79d7adbbca61031f50194. |
| Comment by François Doray [ 26/Feb/15 ] |
|
Pull request submitted here https://github.com/mongodb/mongo/pull/929 |
| Comment by François Doray [ 26/Feb/15 ] |
|
I executed many consecutive writes from a simple C++ application (https://github.com/fdoray/trace-kit/blob/master/src/apps/mongowrite.cpp) and measured their duration. The results are here: https://github.com/fdoray/trace-kit/tree/master/results/mongowrite (withoutfix.csv and withoutfix.svg). At regular intervals, the write operations take longer. Using a critical path analysis (with LTTng), I determined that the slow writes were caused by groupCommitMutex being held during the call to WRITETOJOURNAL (http://fdoray.github.io/tracecompare/?data=mongowrite). I prepared a fix here: https://github.com/fdoray/mongo/commit/4e5711e571b67a6774ac36abd6bfdb59ee1c913d . With the fix, the long writes we had before are gone (withfix.csv and withfix.svg). (Note that there are still a few long writes, but a trace recorded with LTTng shows that they are caused by dirty pages being written to disk, not by lock contention.) I will check that my assumptions about how locking works within MongoDB are correct and send a pull request as soon as possible. |
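For readers who want to reproduce this kind of measurement, here is a minimal sketch of timing consecutive writes and flagging the periodic slow ones. It is not the linked mongowrite.cpp: `timeWrites`, `findSlowWrites`, the `write` callable, and the "more than `factor` times the median" rule are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

using Micros = std::chrono::microseconds;

// Time `count` consecutive calls of `write`, a hypothetical stand-in for
// one write operation against the server.
std::vector<Micros> timeWrites(std::size_t count, const std::function<void()>& write) {
    std::vector<Micros> durations;
    durations.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const auto start = std::chrono::steady_clock::now();
        write();
        const auto stop = std::chrono::steady_clock::now();
        durations.push_back(std::chrono::duration_cast<Micros>(stop - start));
    }
    return durations;
}

// Return the indices of writes that took more than `factor` times the
// median duration (the upper median when the count is even).
std::vector<std::size_t> findSlowWrites(const std::vector<Micros>& durations, int factor) {
    std::vector<std::size_t> slow;
    if (durations.empty()) {
        return slow;
    }
    std::vector<Micros> sorted(durations);
    std::sort(sorted.begin(), sorted.end());
    const Micros median = sorted[sorted.size() / 2];
    for (std::size_t i = 0; i < durations.size(); ++i) {
        if (durations[i] > median * factor) {
            slow.push_back(i);
        }
    }
    return slow;
}
```

If the slow indices come back evenly spaced, that suggests a periodic cause such as a group commit. Timing alone cannot say which lock is involved, which is why the analysis above relied on a trace.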
| Comment by Dwight Merriman [ 26/Nov/13 ] |
|
More detail: if a writer tries to acquire groupCommitMutex in the old code (which it will), it is blocked until the commit finishes completely. That in turn blocks even readers (although greedy behavior may be problematic there regardless). Once fixed, writes can also proceed during the WRITETOJOURNAL and WRITETODATAFILES phases of a LimitedLocks commit. |
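The contention described in this ticket can be illustrated with a toy model. This is a sketch of the general pattern only, not the MongoDB code or the change that was committed: `ToyJournal`, its methods, and the `duringIo` hook are hypothetical names. The idea is to hold the mutex just long enough to detach the pending batch, so a writer arriving during the slow I/O phase is not blocked.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for journal state guarded by a commit mutex.
// None of these names come from the MongoDB source.
class ToyJournal {
public:
    using IoHook = std::function<void()>;

    // A writer queues an entry; it needs the commit mutex only briefly.
    void append(std::string entry) {
        std::lock_guard<std::mutex> lk(commitMutex_);
        pending_.push_back(std::move(entry));
    }

    // Contended pattern: the mutex stays locked for the whole commit,
    // so any append() issued during the I/O phase has to wait.
    std::size_t commitHoldingMutex(const IoHook& duringIo = IoHook()) {
        std::lock_guard<std::mutex> lk(commitMutex_);
        std::vector<std::string> batch;
        batch.swap(pending_);
        writeBatch(batch, duringIo);
        return batch.size();
    }

    // Narrowed pattern: lock only to detach the pending batch, then do
    // the slow I/O with the mutex released.
    std::size_t commitWithNarrowLock(const IoHook& duringIo = IoHook()) {
        std::vector<std::string> batch;
        {
            std::lock_guard<std::mutex> lk(commitMutex_);
            batch.swap(pending_);
        }
        writeBatch(batch, duringIo);
        return batch.size();
    }

    std::size_t pendingCount() const {
        std::lock_guard<std::mutex> lk(commitMutex_);
        return pending_.size();
    }

    std::size_t writtenCount() const { return written_.load(); }

private:
    // Stand-in for the journal write. The hook runs at the point where
    // real code would be busy with I/O.
    void writeBatch(const std::vector<std::string>& batch, const IoHook& duringIo) {
        if (duringIo) {
            duringIo();
        }
        written_ += batch.size();
    }

    mutable std::mutex commitMutex_;
    std::vector<std::string> pending_;
    std::atomic<std::size_t> written_{0};
};

// Usage sketch: a writer thread appends while the commit is in its I/O
// phase. Passing the same hook to commitHoldingMutex() would deadlock,
// because join() would wait on a writer that is waiting for the mutex.
std::size_t pendingAfterConcurrentWrite() {
    ToyJournal journal;
    journal.append("a");
    journal.append("b");
    journal.commitWithNarrowLock([&journal] {
        std::thread writer([&journal] { journal.append("c"); });
        writer.join();
    });
    return journal.pendingCount();
}
```

The entry appended during the I/O phase is simply left pending for the next commit. The sketch assumes a single committer thread and ignores durability ordering, which a real journal has to handle.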