-
Type:
Bug
-
Resolution: Done
-
Priority:
Major - P3
-
None
-
Affects Version/s: 3.2.3, 3.3.1
-
Component/s: Networking
-
None
-
Fully Compatible
-
Windows
-
Platforms 12 (04/01/16), Platforms 13 (04/22/16), Platforms 14 (05/13/16), Platforms 15 (06/03/16), Platforms 16 (06/24/16), Platforms 17 (07/15/16), Platforms 18 (08/05/16), Platforms 2016-08-26, Platforms 2016-09-19
-
0
-
None
-
0
-
None
-
None
-
None
-
None
-
None
-
None
NOTE: I have only tried to reproduce this on Windows DEBUG.
It looks like the ASIO network interface fails to properly reuse the pooled connections and tries to run AsyncStream::connect for each request. Furthermore, it looks like AsyncStream::connect takes 2 seconds to return, which significantly slows down single-threaded scripts such as our JS tests.
I discovered this trying to figure out why find operations block for 2 seconds, but write commands don't. Find operations in 3.2 go through the network interface and write commands still use the legacy client.
After adding some instrumentation to the code to print the connect time, I observed that each find operation executes the connect code, which takes about 2 seconds. Here is an example of the output. Note the lines which say 'Connect completed in 2002 ms'.
> assert.eq(2, st.s0.getDB('TestDB').TestColl.find().itcount()); s20004| 2016-02-18T18:49:42.339-0500 D ASIO [NetworkInterfaceASIO-0] Connecting to kaloianmdesktop:20000 ---> s20004| 2016-02-18T18:49:44.341-0500 D ASIO [NetworkInterfaceASIO-0] Connect completed in 2002 ms d20000| 2016-02-18T18:49:44.341-0500 I NETWORK [initandlisten] connection accepted from 10.4.123.233:64094 #17 (4 connections now open) s20004| 2016-02-18T18:49:44.353-0500 I ASIO [NetworkInterfaceASIO-0] Successfully connected to kaloianmdesktop:20000 > assert.eq(2, st.s0.getDB('TestDB').TestColl.find().itcount()); s20004| 2016-02-18T18:49:47.202-0500 D ASIO [NetworkInterfaceASIO-0] Connecting to kaloianmdesktop:20000 ---> s20004| 2016-02-18T18:49:49.206-0500 D ASIO [NetworkInterfaceASIO-0] Connect completed in 2003 ms d20000| 2016-02-18T18:49:49.206-0500 I NETWORK [initandlisten] connection accepted from 10.4.123.233:64097 #18 (5 connections now open) s20004| 2016-02-18T18:49:49.213-0500 I ASIO [NetworkInterfaceASIO-0] Successfully connected to kaloianmdesktop:20000
Note that both requests are going through NetworkExecutor-0, so opening new connections cannot be attributed to the usage of multiple network executors for scaling.
I believe this is what is causing the slow down in the Windows sharding suites.
- related to
-
SERVER-22077 Improve the runtime of sharding js tests
-
- Closed
-