[SERVER-16903] Prevent query to single mongod from hanging during server power failure Created: 16/Jan/15 Updated: 09/Apr/20 |
|
| Status: | Open |
| Project: | Core Server |
| Component/s: | Networking |
| Affects Version/s: | 2.8.0-rc5 |
| Fix Version/s: | features we're not sure of |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Robert Guo (Inactive) | Assignee: | DO NOT USE - Backlog - Platform Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | 28qa, move-sa, platforms-re-triaged | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Ubuntu 14.04 |
| Description |
|
During power-cycle testing of a single mongod, we found that a client query hangs when the server terminates abruptly because of a power failure. The hang ends in a socket exception only after the TCP keepalive period has expired. That period is tcp_keepalive_time + tcp_keepalive_intvl * tcp_keepalive_probes. The first two values are set to a maximum of 5 minutes each, but tcp_keepalive_probes is left at the system default, which is 9 for Windows and Linux. As a result, the socket is not terminated until 5 + 5 * 9 = 50 minutes have passed, which is far too long for a client to wait. A possible solution is to set tcp_keepalive_probes to a lower number. |
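
To make the arithmetic above concrete, here is a minimal Python sketch of the worst-case detection time and of lowering the probe count on a single socket. The helper names (`keepalive_timeout_seconds`, `configure_keepalive`) and the probe count of 3 are assumptions chosen for illustration; the ticket does not propose a specific value, and this is not the server's own code. The per-socket options are applied only where the platform's `socket` module exposes them.

```python
import socket


def keepalive_timeout_seconds(idle_s, interval_s, probes):
    """Worst-case time before keepalive declares a silent peer dead.

    Mirrors the formula in the description:
    tcp_keepalive_time + tcp_keepalive_intvl * tcp_keepalive_probes
    """
    return idle_s + interval_s * probes


def configure_keepalive(sock, idle_s=300, interval_s=300, probes=3):
    """Enable keepalive and set per-socket limits where available.

    probes=3 is a hypothetical value, not one taken from the ticket.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants are platform dependent, so look each one up
    # and skip any that this build of Python does not define.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),
        ("TCP_KEEPINTVL", interval_s),
        ("TCP_KEEPCNT", probes),
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    # Values from the description: 5 min idle, 5 min interval, 9 probes.
    print(keepalive_timeout_seconds(300, 300, 9) / 60)  # 50.0 minutes
    # Same timers with the illustrative probe count of 3.
    print(keepalive_timeout_seconds(300, 300, 3) / 60)  # 20.0 minutes
```

With the timers unchanged, reducing the probe count from 9 to 3 would cut the worst case from 50 to 20 minutes. Shortening the interval as well would reduce it further, at the cost of more keepalive traffic on idle connections.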