- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: 2.8.0-rc5
- Component/s: Networking
- Environment: Ubuntu 14.04
During power cycle testing of a single mongod, we found that a client query hangs when the server terminates suddenly due to a power failure.
The hung query results in a socket exception only after the TCP keepalive period has expired. The expiration time is tcp_keepalive_time + tcp_keepalive_intvl * tcp_keepalive_probes. The first two values are capped at a maximum of 5 minutes each, but tcp_keepalive_probes is left at the system default, which is 9 for Windows and Linux. As a result, the socket is terminated only after 5 + 5 * 9 = 50 minutes, which is a long time.
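To make the arithmetic above concrete, here is a minimal sketch of the expiration formula. The function name `keepalive_timeout_seconds` and the constant `FIVE_MINUTES` are made up for illustration and are not taken from the server code; the inputs are the values quoted in this ticket.

```python
def keepalive_timeout_seconds(keepalive_time, keepalive_intvl, keepalive_probes):
    """Seconds an idle connection to a dead peer can linger before TCP gives up.

    keepalive_time   -- idle seconds before the first probe is sent
    keepalive_intvl  -- seconds between unanswered probes
    keepalive_probes -- number of unanswered probes before the socket is dropped
    """
    return keepalive_time + keepalive_intvl * keepalive_probes


FIVE_MINUTES = 5 * 60  # hypothetical name for the 5-minute cap described above

# Values from this ticket: 5 min idle, 5 min interval, 9 probes.
timeout = keepalive_timeout_seconds(FIVE_MINUTES, FIVE_MINUTES, 9)
print(timeout / 60)  # 50.0 (minutes)
```

With the same 5-minute values, dropping the probe count to 3 would give 5 + 5 * 3 = 20 minutes.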
A possible solution is to set tcp_keepalive_probes to a lower number.
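As a rough illustration of the proposed fix, the probe count can be lowered per socket rather than system-wide. The sketch below is not how mongod configures its sockets; it only shows the idea, assuming a Linux host where Python exposes the `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` options. The helper name `enable_keepalive` and the probe count of 3 are illustrative choices, not values from this ticket.

```python
import socket


def enable_keepalive(sock, idle_secs=300, interval_secs=300, probes=3):
    """Turn on TCP keepalive and, where supported, override the per-socket tuning.

    The default of 3 probes is an illustrative assumption, not a recommended value.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These option names are platform-specific (present on Linux), so each is guarded.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_secs)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_secs)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock


# Usage: configure a socket before connecting it.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock, probes=3)
```

On Linux the system-wide default can also be changed through the `net.ipv4.tcp_keepalive_probes` sysctl, but that affects every socket on the host that does not override the value itself.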