Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-21854

NetworkInterface ASIO skips timing out requests on OSX

    XMLWordPrintableJSON

Details

    • Icon: Bug Bug
    • Resolution: Done
    • Icon: Major - P3 Major - P3
    • None
    • 3.2.0
    • Networking
    • None
    • Fully Compatible
    • ALL
    • Hide

      On an OSX build (doesn't matter if debug or release):

      scons --dbg=on --opt=off --allocator=system -j8 --osx-version-min=10.7 --libc++ --mute mongod mongos mongo mongobridge
       
      python buildscripts/resmoke.py --repeat=50 --log=file --executor=sharding --storageEngine=wiredTiger jstests/sharding/all_config_servers_blackholed_from_mongos.js
      

      Show
      On an OSX build (doesn't matter if debug or release): scons --dbg=on --opt=off --allocator=system -j8 --osx-version-min=10.7 --libc++ --mute mongod mongos mongo mongobridge   python buildscripts/resmoke.py --repeat=50 --log=file --executor=sharding --storageEngine=wiredTiger jstests/sharding/all_config_servers_blackholed_from_mongos.js
    • Platforms 10 (02/19/16), Platforms 12 (04/01/16), Platforms 13 (04/22/16), Platforms 14 (05/13/16), Platforms 15 (06/03/16), Platforms 16 (06/24/16), Platforms 17 (07/15/16), Platforms 18 (08/05/16), Platforms 2016-08-26, Platforms 2016-09-19, Platforms 2016-10-10, Platforms 2016-11-21, Platforms 2017-01-23

    Description

      This bug only seems to be reproducible on OSX and is most easily visible when running the 'blackhole' test, where mongobridge drops all requests against the config servers.

      From inspecting the logs, it seems that occasionally network requests with timeout do not get timed out and just sit there.

      I am attaching the complete logs, but here is the relevant snippets (with verbosity 3 and some extra tracing of the request ids). From the snippet below it can be seen that request 151 never times out, even though it has an expiration time of 30 seconds after the request was scheduled (2015-12-10T16:29:59.426-0500):

      [js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.426-0500 s20020| 2015-12-10T16:29:29.425-0500 D SHARDING [conn2] command: NonExistentDB.$cmd { insert: "TestColl", documents: [ { _id: 0.0, value: "This value will never be inserted" } ], ordered: true } ntoreturn: 1 options: 0
      [js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.426-0500 s20020| 2015-12-10T16:29:29.426-0500 D EXECUTOR [conn2] Scheduling remote command request: RemoteCommand 151 -- target:kaloianm-mac.local:20016 db:config expDate:2015-12-10T16:29:59.426-0500 cmd:{ find: "databases", filter: { _id: "NonExistentDB" }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp 1449782939000|15, t: 1 } }, limit: 1, maxTimeMS: 30000 }
      [js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.426-0500 s20020| 2015-12-10T16:29:29.426-0500 D ASIO     [conn2] startCommand: RemoteCommand 151 -- target:kaloianm-mac.local:20016 db:config expDate:2015-12-10T16:29:59.426-0500 cmd:{ find: "databases", filter: { _id: "NonExistentDB" }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp 1449782939000|15, t: 1 } }, limit: 1, maxTimeMS: 30000 }
      [js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.426-0500 s20020| 2015-12-10T16:29:29.426-0500 D ASIO     [NetworkInterfaceASIO-ShardRegistry-0] Connecting to kaloianm-mac.local:20016
      [js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.427-0500 b20016| 2015-12-10T16:29:29.427-0500 I NETWORK  [main] connection accepted from 10.4.126.94:65167 #15 (1 connection now open)
      [js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.428-0500 s20020| 2015-12-10T16:29:29.427-0500 D ASIO     [NetworkInterfaceASIO-ShardRegistry-0] Starting asynchronous command 152 on host kaloianm-mac.local:20016
      [js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.428-0500 c20019| 2015-12-10T16:29:29.428-0500 I NETWORK  [initandlisten] connection accepted from 10.4.126.94:65168 #16 (13 connections now open)
      DISCARDED --> [js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:29.428-0500 b20016| 2015-12-10T16:29:29.428-0500 I BRIDGE   [thread6] Discarding "isMaster" command with arguments { isMaster: 1, hostInfo: "kaloianm-mac.local:20020" } from kaloianm-mac.local:20020
      ...
      DIFFERENT TIMED OUT REQUEST --> [js_test:all_config_servers_blackholed_from_mongos] 2015-12-10T16:29:58.845-0500 s20020| 2015-12-10T16:29:58.845-0500 D ASIO     [NetworkInterfaceASIO-ShardRegistry-0] Operation timed out: RemoteCommand 149 -- target:kaloianm-mac.local:20016 db:admin expDate:2015-12-10T16:29:58.838-0500 cmd:{ _getUserCacheGeneration: 1, maxTimeMS: 30000 }
      

      Attachments

        Activity

          People

            samantha.ritter@mongodb.com Samantha Ritter (Inactive)
            kaloian.manassiev@mongodb.com Kaloian Manassiev
            Votes:
            0 Vote for this issue
            Watchers:
            13 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: