MongoDB Database Tools / TOOLS-1665

Mongotools may block forever on dead connections

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 3.6.0-rc0, 3.4.15
    • Affects Version/s: None
    • Component/s: None
    • None
    • Not Needed

      Various mongotools (mongodump, mongoexport, mongorestore, mongofiles) disable socket timeouts on their MongoDB connections. This allows the tools to work properly on high-latency networks, but it also leaves them vulnerable to blocking forever on "dead" connections. If a connection is dropped uncleanly (without a TCP shutdown), any read in progress on that socket will block forever. To avoid this, the tools can enable TCP keepalive, which sends probes to determine whether idle connections are still alive.

      mgo enables TCP keepalive only when a custom dialer is not present (https://github.com/10gen/mgo/blob/1826d82/server.go#L166). Since the mongotools use a custom SSL dialer that does not set keepalive, tools connecting over SSL may block forever if the underlying connection is dropped without a clean TCP shutdown.

      The following shows what happens when I shut off Wi-Fi while running mongodump against a host over SSL (I cancelled the dump after a few minutes; otherwise it would have hung forever):

      $ mongodump --host host.mongodb.net:27017 --authenticationDatabase admin --username user --password pass --ssl --db test
      2017-05-15T13:53:53.577-0400	writing test.mongomirror to
      2017-05-15T13:53:56.320-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:53:59.321-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:02.321-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:05.321-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:08.320-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:11.318-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:14.321-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:17.318-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:20.320-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:23.318-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:26.319-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:29.317-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:32.321-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:35.318-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:38.322-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:41.322-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:44.321-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:47.321-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:50.319-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:53.322-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:56.322-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:54:59.319-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:05.322-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:08.320-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:11.320-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:14.317-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:17.322-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:20.322-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:23.320-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:26.317-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:29.321-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:32.317-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:35.320-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:38.320-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:41.321-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:44.317-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:47.321-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:50.320-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:53.317-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:55:56.318-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      ^C2017-05-15T13:55:56.624-0400	signal 'interrupt' received; attempting to shut down
      2017-05-15T13:55:59.321-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:56:02.320-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      2017-05-15T13:56:05.317-0400	[#########...............]  test.mongomirror  383925/1000000  (38.4%)
      ^C2017-05-15T13:56:06.599-0400	signal 'interrupt' received; forcefully terminating
      

            Assignee:
            shane.harvey@mongodb.com Shane Harvey
            Reporter:
            shane.harvey@mongodb.com Shane Harvey
            Votes:
            0
            Watchers:
            6

              Created:
              Updated:
              Resolved: