Details
-
Bug
-
Resolution: Done
-
Major - P3
-
None
-
3.2.0
-
None
-
Fully Compatible
-
ALL
-
-
Platforms 10 (02/19/16), Platforms 12 (04/01/16), Platforms 13 (04/22/16), Platforms 14 (05/13/16), Platforms 15 (06/03/16), Platforms 16 (06/24/16), Platforms 17 (07/15/16), Platforms 18 (08/05/16), Platforms 2016-08-26, Platforms 2016-09-19, Platforms 2016-10-10, Platforms 2016-11-21, Platforms 2017-01-23
Description
This bug only seems to be reproducible on OSX and is most easily visible when running the 'blackhole' test, where mongobridge drops all requests against the config servers.
From inspecting the logs, it seems that occasionally network requests with timeout do not get timed out and just sit there.
I am attaching the complete logs, but here is the relevant snippets (with verbosity 3 and some extra tracing of the request ids). From the snippet below it can be seen that request 151 never times out, even though it has an expiration time of 30 seconds after the request was scheduled (2015-12-10T16:29:59.426-0500):
[js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.426-0500 s20020| 2015-12-10T16:29:29.425-0500 D SHARDING [conn2] command: NonExistentDB.$cmd { insert: "TestColl", documents: [ { _id: 0.0, value: "This value will never be inserted" } ], ordered: true } ntoreturn: 1 options: 0
|
[js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.426-0500 s20020| 2015-12-10T16:29:29.426-0500 D EXECUTOR [conn2] Scheduling remote command request: RemoteCommand 151 -- target:kaloianm-mac.local:20016 db:config expDate:2015-12-10T16:29:59.426-0500 cmd:{ find: "databases", filter: { _id: "NonExistentDB" }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp 1449782939000|15, t: 1 } }, limit: 1, maxTimeMS: 30000 }
|
[js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.426-0500 s20020| 2015-12-10T16:29:29.426-0500 D ASIO [conn2] startCommand: RemoteCommand 151 -- target:kaloianm-mac.local:20016 db:config expDate:2015-12-10T16:29:59.426-0500 cmd:{ find: "databases", filter: { _id: "NonExistentDB" }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp 1449782939000|15, t: 1 } }, limit: 1, maxTimeMS: 30000 }
|
[js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.426-0500 s20020| 2015-12-10T16:29:29.426-0500 D ASIO [NetworkInterfaceASIO-ShardRegistry-0] Connecting to kaloianm-mac.local:20016
|
[js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.427-0500 b20016| 2015-12-10T16:29:29.427-0500 I NETWORK [main] connection accepted from 10.4.126.94:65167 #15 (1 connection now open)
|
[js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.428-0500 s20020| 2015-12-10T16:29:29.427-0500 D ASIO [NetworkInterfaceASIO-ShardRegistry-0] Starting asynchronous command 152 on host kaloianm-mac.local:20016
|
[js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.428-0500 c20019| 2015-12-10T16:29:29.428-0500 I NETWORK [initandlisten] connection accepted from 10.4.126.94:65168 #16 (13 connections now open)
|
DISCARDED --> [js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.428-0500 b20016| 2015-12-10T16:29:29.428-0500 I BRIDGE [thread6] Discarding "isMaster" command with arguments { isMaster: 1, hostInfo: "kaloianm-mac.local:20020" } from kaloianm-mac.local:20020
|
...
|
DIFFERENT TIMED OUT REQUEST --> [js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:58.845-0500 s20020| 2015-12-10T16:29:58.845-0500 D ASIO [NetworkInterfaceASIO-ShardRegistry-0] Operation timed out: RemoteCommand 149 -- target:kaloianm-mac.local:20016 db:admin expDate:2015-12-10T16:29:58.838-0500 cmd:{ _getUserCacheGeneration: 1, maxTimeMS: 30000 }
|