DRIVERS-383

Enable and configure TCP Keepalive by default

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Component/s: None
    • Labels:
      Key           Status/Resolution  FixVersion
      NODE-1024     Done               3.0.0
      PYTHON-1279   Fixed              3.5
      SCALA-312     Works as Designed  2.2.0
      JAVA-2531     Fixed              3.5.0
      CSHARP-1994   Fixed              2.7.1
      CXX-1363      Done
      PHPC-969      Done               1.4.0-beta1, 1.4.0
      CDRIVER-2176  Fixed              1.8.0
      PERL-780      Fixed              2.1.0
      GODRIVER-37   Fixed              0.0.1
      RUBY-1283     Fixed              2.5.1
      RUST-170      Fixed              1.1.0
      SWIFT-485     Works as Designed

      Problem Description

      TCP keepalive is disabled by default in the Java driver (and in other drivers). As a result, when a server goes down while a connection is in the middle of a socket read, that connection can remain stuck in a waiting state.

      We had a situation where a mongos server crashed, leaving 100 open connections on the client side. When we recovered the mongos, the Java driver still had 100 bad connections checked out from the pool and would not open new ones.

      As part of this change, drivers should include in their documentation a link to the keepalive section of the MongoDB Diagnostics FAQ.

      Specification
      1. A driver MUST enable TCP keepalive by default. This matches the behavior of the MongoDB server.
      2. A driver MUST deprecate TCP keepalive-related options in the connection string (and any other way that it is configured), as there is no demonstrated benefit to allowing it to be disabled. This also matches the behavior of the server.
      3. A driver SHOULD set tcp_keepalive_time to 300 seconds unless it determines that the system default is already less than that. If the driver is unable to determine the system default at all it should not attempt to change it. This matches the behavior of the server as well.
      4. A driver SHOULD set tcp_keepalive_intvl to 10 seconds unless it determines that the system default is already less than that. If the driver is unable to determine the system default at all it should not attempt to change it. This is not the current behavior of the server, but if accepted here it will be recommended. The reasoning is that with the default interval of 75 seconds and the default of 9 probes, the actual time to failure is 300 + (75 * 9) = 975 seconds = 16.25 minutes. With a 10-second interval between probes it becomes a more reasonable 300 + (10 * 9) = 390 seconds = 6.5 minutes.
      5. A driver SHOULD set tcp_keepalive_cnt to 9 probes unless it determines that the system default is already less than that. If the driver is unable to determine the system default at all it should not attempt to change it.
      6. A driver MUST document how keepalive-related options are configured. Drivers that can set tcp_keepalive_time and tcp_keepalive_intvl to the values mandated above MUST document that they do so. Drivers that cannot MUST document that they do not and link to the appropriate MongoDB Diagnostics FAQ keepalive section for instructions on setting these values at the system level.
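
      The rules above can be sketched roughly as follows. This is only an illustration, not any driver's actual implementation: the helper names (time_to_failure, lower_tcp_option, configure_keepalive) are hypothetical, and the mapping of tcp_keepalive_time, tcp_keepalive_intvl and tcp_keepalive_cnt to the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT socket options assumes a Linux-like platform. Where an option is missing or cannot be read, the sketch leaves the system default untouched, as items 3 to 5 require.

```python
import socket

# Ceilings from specification items 3-5. The option names are the
# Linux-style socket constants (an assumption; other platforms differ).
KEEPALIVE_TARGETS = {
    "TCP_KEEPIDLE": 300,   # tcp_keepalive_time, seconds
    "TCP_KEEPINTVL": 10,   # tcp_keepalive_intvl, seconds
    "TCP_KEEPCNT": 9,      # tcp_keepalive_cnt, probes
}


def time_to_failure(idle, interval, probes):
    """Seconds until a dead peer is detected: idle time plus all probe intervals."""
    return idle + interval * probes


def lower_tcp_option(sock, name, ceiling):
    """Lower one TCP option to `ceiling` if its current value is higher.

    Returns the value in effect afterwards, or None if the option is
    unavailable on this platform or a socket call fails.
    """
    option = getattr(socket, name, None)
    if option is None:
        return None  # cannot determine the default: do not change it
    try:
        current = sock.getsockopt(socket.IPPROTO_TCP, option)
        if current > ceiling:
            sock.setsockopt(socket.IPPROTO_TCP, option, ceiling)
            current = ceiling
        return current
    except OSError:
        return None


def configure_keepalive(sock):
    """Enable keepalive (item 1) and apply the ceilings (items 3-5)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return {
        name: lower_tcp_option(sock, name, ceiling)
        for name, ceiling in KEEPALIVE_TARGETS.items()
    }


if __name__ == "__main__":
    print(time_to_failure(300, 75, 9))  # 975 seconds with a 75-second interval
    print(time_to_failure(300, 10, 9))  # 390 seconds with a 10-second interval
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(configure_keepalive(s))
```

      Reading the current value before writing means a system default that is already lower than the ceiling is never raised. A driver following item 6 could use the returned dictionary to decide which settings it managed to apply and which it must document as left to the system.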

            Assignee: Unassigned
            Reporter: Roy Rim (roy.rim@mongodb.com)
            Votes: 5
            Watchers: 27