The server should consider enabling TCP_USER_TIMEOUT for the reasons described in DRIVERS-1692. Without it, an operation can block for ~16 minutes rather than ~5 minutes (the server's default TCP keepalive period).
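As a rough illustration of what enabling the option involves, the following Python sketch sets TCP_USER_TIMEOUT on a socket. It is not the server's implementation: the helper name `enable_user_timeout` and the 5-minute value are assumptions chosen only to mirror the keepalive period mentioned above, and the option is available only on platforms that expose it (Linux).

```python
import socket

# Hypothetical value for illustration: abort the connection after
# 5 minutes (in milliseconds) of transmitted data remaining unacknowledged.
USER_TIMEOUT_MS = 300_000


def enable_user_timeout(sock, timeout_ms=USER_TIMEOUT_MS):
    """Set TCP_USER_TIMEOUT on sock; return False if the platform lacks it."""
    opt = getattr(socket, "TCP_USER_TIMEOUT", None)
    if opt is None:
        # The constant is only defined where the OS supports the option.
        return False
    sock.setsockopt(socket.IPPROTO_TCP, opt, timeout_ms)
    return True
```

A caller would apply this to each accepted or outgoing socket before use, for example `enable_user_timeout(conn)`.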
If the server does not do this automatically, admins can control this timeout behavior through the Linux net.ipv4.tcp_retries2 sysctl setting, which the kernel documentation describes as follows:
tcp_retries2 - INTEGER
This value influences the timeout of an alive TCP connection,
when RTO retransmissions remain unacknowledged.
Given a value of N, a hypothetical TCP connection following
exponential backoff with an initial RTO of TCP_RTO_MIN would
retransmit N times before killing the connection at the (N+1)th RTO.
The default value of 15 yields a hypothetical timeout of 924.6
seconds and is a lower bound for the effective timeout.
TCP will effectively time out at the first RTO which exceeds the
hypothetical timeout.
RFC 1122 recommends at least 100 seconds for the timeout,
which corresponds to a value of at least 8.
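The figures quoted in the tcp_retries2 description can be reproduced with a short Python sketch. It assumes the Linux defaults of 200 ms for TCP_RTO_MIN and a 120 second cap on the RTO (TCP_RTO_MAX); neither number is stated in the text above, and the function names are hypothetical.

```python
# Assumed Linux defaults, in milliseconds (not stated in the quoted text).
TCP_RTO_MIN_MS = 200
TCP_RTO_MAX_MS = 120_000


def hypothetical_timeout(retries):
    """Seconds until the connection is killed at the (retries + 1)th RTO.

    Each RTO doubles, starting from TCP_RTO_MIN and capped at TCP_RTO_MAX.
    """
    total_ms = sum(
        min(TCP_RTO_MIN_MS * 2**i, TCP_RTO_MAX_MS) for i in range(retries + 1)
    )
    return total_ms / 1000


def min_retries_for(seconds):
    """Smallest tcp_retries2 value whose hypothetical timeout is >= seconds."""
    retries = 0
    while hypothetical_timeout(retries) < seconds:
        retries += 1
    return retries


print(hypothetical_timeout(15))  # 924.6 (the default)
print(hypothetical_timeout(8))   # 102.2
print(min_retries_for(100))      # 8 (the RFC 1122 minimum)
```

Under these assumptions the default of 15 gives 924.6 seconds, which is the ~16 minute figure mentioned earlier, and 8 is the smallest value that meets the 100 second recommendation.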