[SERVER-29830] Fix async accept performance in TransportLayerASIO Created: 23/Jun/17 Updated: 30/Nov/23 Resolved: 08/Sep/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Networking |
| Affects Version/s: | None |
| Fix Version/s: | 3.5.13 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Jonathan Reams | Assignee: | Jonathan Reams |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Platforms 2017-07-31, Platforms 2017-08-21, Platforms 2017-09-11 |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
Using async_accept causes a performance hit in TransportLayerASIO, even when the rest of the TransportLayer is used synchronously. Although we should investigate why the perf regression occurs, we should also fix synchronous accept. |
| Comments |
| Comment by Ramon Fernandez Marina [ 08/Sep/17 ] |
|
Author: Jonathan Reams <jbreams@mongodb.com>
Message: |
| Comment by Jonathan Reams [ 30/Aug/17 ] |
|
I've filed a pull request with the upstream project to fix this. |
| Comment by Jonathan Reams [ 29/Aug/17 ] |
|
I dug into this and I think I've figured out what's going on. When a socket is accepted via async_accept, it gets registered with epoll even if no async operations ever happen on it. That means epoll_wait returns and wakes up the listener thread every time a message arrives on a client connection. The listener goes right back to sleep, but the context switch kills perf in the microbenchmarks when there are more client threads than available cores. I changed ASIO so that sockets only get registered with epoll when their first async operation occurs, and I'm seeing a dramatic perf increase in the microbenchmarks (at least in the insert tests; other tests are still running) - https://evergreen.mongodb.com/task/performance_linux_mmap_standalone_insert_patch_2d7aef7ca03e3a7153f4e2856ea96a862d88e1de_59a58dd6e3c3317cab00082f_17_08_29_15_53_07. |
| Comment by Henrik Ingo (Inactive) [ 16/Aug/17 ] |
|
acm It seems you've done your patch on whatever was the tip of master when you started working on it. Unfortunately there has been some unrelated fluctuation in mmapv1 inserts recently, and the commit you happened to choose as a base is a low point in the recent history. This makes the patch build look good, but I'm afraid that is just random luck. It would be better if you could apply your patch directly on the failed commit, which is 23448. Unfortunately, due to EVG-727, such a patch build at this date will fail the compile phase, since it will compile against the wrong version of the enterprise modules. However, since this is a regression affecting the mmapv1 engine, you can simply remove the enterprise modules and not run the inMemory variant. I think removing these lines would work: In the short term, though, we can also compare your result above to those old commits. Unfortunately I can't link to such a display, but:
e394a1740ea and b05d8d2 are the Evergreen data points after and before the regression (23448), respectively. You should now see that your patch build results are actually below both of those. Which isn't really helpful, but also not encouraging. (The reason your results are also below e394a1740 is of course very likely completely unrelated to this issue, and probably due to random fluctuation between June and now.) |
| Comment by Githook User [ 15/Aug/17 ] |
|
Author: Andrew Morrow <acm@mongodb.com>
Message: |