During power cycle testing of a single mongod, we found that a client query would hang when the server terminated suddenly due to a power failure.
The hang ends with a socket exception only after the TCP keepalive period has expired. The expiration time is tcp_keepalive_time + tcp_keepalive_intvl * tcp_keepalive_probes. The first two values are each capped at 5 minutes, but tcp_keepalive_probes is left at the system default, which is 9 for Windows and Linux. With those values, the socket is not terminated until 5 + 5 * 9 = 50 minutes have passed, which is a long time for a client to wait.
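The arithmetic can be checked with a small Python sketch. The function name keepalive_timeout_seconds is hypothetical and is not part of mongod or any driver; the inputs are the 5 minute caps and the default of 9 probes quoted in this report.

```python
def keepalive_timeout_seconds(keepalive_time_s, keepalive_intvl_s, keepalive_probes):
    """Worst-case time before a dead peer is detected by TCP keepalive.

    Hypothetical helper implementing the formula from this report:
    tcp_keepalive_time + tcp_keepalive_intvl * tcp_keepalive_probes
    """
    return keepalive_time_s + keepalive_intvl_s * keepalive_probes


# Values from the report: 5 minute idle time, 5 minute interval, 9 probes.
timeout_s = keepalive_timeout_seconds(300, 300, 9)
print(timeout_s / 60, "minutes")
```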
A possible solution is to set the tcp_keepalive_probes to a lower number.
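As an illustration only, the following Python sketch shows how a lower probe count could be set per socket. It is not the mongod implementation. The helper name configure_keepalive and the probe count of 3 are assumptions chosen for the example, and the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT constants are not exposed on every platform, so the helper skips any that are missing.

```python
import socket


def configure_keepalive(sock, idle_s=300, interval_s=300, probes=3):
    """Enable TCP keepalive on sock and lower the probe count.

    Hypothetical helper; the default of 3 probes is an assumed value.
    Returns a dict of the per-socket options that were actually applied.
    """
    # Turn keepalive on for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    applied = {}
    wanted = (
        ("TCP_KEEPIDLE", idle_s),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval_s),  # time between probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    )
    for name, value in wanted:
        option = getattr(socket, name, None)
        if option is None:
            # This platform's socket module does not expose the option.
            continue
        sock.setsockopt(socket.IPPROTO_TCP, option, value)
        applied[name] = value
    return applied


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(configure_keepalive(s))
    s.close()
```

With 3 probes and the 5 minute caps unchanged, the worst case would drop to 5 + 5 * 3 = 20 minutes.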