A workload with many threads doing simple, small point queries shows a large performance regression between 3.5.8 and 3.5.9; the regression persists in all subsequent 3.6 and 4.0 versions. On a 24-core bare-metal machine the regression on one test is about 25%; on a larger 64-core virtualized machine (m3.4xlarge) it is about 70%.
A git bisect identifies the following commit as the culprit:
923ad3ba8160f2cd614e1258ef19294bd502af78 is the first bad commit
SERVER-29403 Implement TransportLayerASIO
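For readers unfamiliar with how a bisect narrows a regression to one commit, here is a minimal sketch of the underlying binary search. It is illustrative only and not the procedure actually run for this ticket. The function name first_bad_commit, the commit labels, and the is_bad predicate are all hypothetical. The sketch assumes the commits are ordered oldest to newest, that the newest one shows the regression, and that every commit after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumptions (hypothetical setup, not taken from the ticket):
    - commits are ordered oldest to newest,
    - the last commit is known to be bad,
    - once a commit is bad, every later commit is bad too.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # first bad commit is at mid or earlier
        else:
            lo = mid + 1    # first bad commit is strictly after mid
    return commits[lo]


if __name__ == "__main__":
    # Hypothetical history of ten commits where the regression starts at "c6".
    history = [f"c{i}" for i in range(10)]
    print(first_bad_commit(history, lambda c: int(c[1:]) >= 6))
```

In practice is_bad would build the commit and run the point-query benchmark against a throughput threshold. Each step halves the remaining range, so a range of N commits needs roughly log2(N) builds.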
Profiling identifies a lot of time spent in the following stacks:
This appears to be due to excessive calls to ERR_clear_error from ASIO.