[SERVER-57468] Enable TCP_USER_TIMEOUT by default | Created: 04/Jun/21 | Updated: 16/Feb/23 |
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Shane Harvey | Assignee: | Backlog - Service Architecture |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | sa-remove-fv-backlog-22 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Assigned Teams: | Service Arch |
| Backport Requested: | v6.1, v6.0, v5.0, v4.4, v4.2 |
| Sprint: | Service Arch 2022-12-26, Service Arch 2022-08-22, Service Arch 2022-09-05, Service Arch 2022-09-19, Service Arch 2022-10-31, Service Arch 2022-11-14, Service Arch 2022-11-28, Service Arch 2022-12-12, Service Arch 2023-01-09, Service Arch 2023-01-23, Service Arch 2023-02-06 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
The server should consider enabling TCP_USER_TIMEOUT for the same reasons described in DRIVERS-1692. Doing so would address a problem where an operation can block for ~16 minutes instead of ~5 minutes (the server's default TCP keepalive period). If the server does not do this automatically, admins can still control this timeout behavior through the net.ipv4.tcp_retries2 sysctl setting.
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt |
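The ~16 minute figure can be reproduced with a little arithmetic. The sketch below assumes Linux's usual retransmission bounds (a 200 ms minimum RTO, a 120 s maximum RTO, and exponential backoff) together with the default tcp_retries2 of 15. The function name is made up for illustration, and real connections will differ because the RTO depends on the measured round-trip time.

```python
def hypothetical_timeout_s(retries=15, rto_min_s=0.2, rto_max_s=120.0):
    """Sum the waits after the initial send and after each retransmission.

    Assumes the RTO starts at rto_min_s, doubles after every
    unacknowledged transmission, and is capped at rto_max_s.
    """
    total = 0.0
    rto = rto_min_s
    # One wait for the original transmission plus one per retransmission.
    for _ in range(retries + 1):
        total += min(rto, rto_max_s)
        rto *= 2
    return total


if __name__ == "__main__":
    seconds = hypothetical_timeout_s()
    print(f"{seconds:.1f} s, or about {seconds / 60:.1f} minutes")
```

Under these assumptions the default works out to roughly 15 and a half minutes, which the kernel documentation linked above describes as a lower bound on the effective timeout.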
| Comments |
| Comment by Shane Harvey [ 19/Aug/22 ] | ||
|
Agreed, a non-default option SGTM. | ||
| Comment by Billy Donahue [ 19/Aug/22 ] | ||
|
shane.harvey@mongodb.com I think we need to implement the option so we have it in our back pocket. Unlike tcp_retries2, it's a per-connection setting, so that's a big advantage. Maybe we can log a warning if TCP_USER_TIMEOUT is set to a value that's incompatible with the three TCP_KEEP* values. I'm also not thrilled that our TCP keepalive knobs are all but hardcoded.
TCP_KEEPCNT is missing altogether; we never adjust it. We should add that capability, too. It seems like these will need to be adjustable in exactly the same way as TCP_USER_TIMEOUT for this to make sense as a holistic product feature. | ||
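To illustrate the per-connection point, here is a minimal Python sketch (not server code) that sets TCP_USER_TIMEOUT on a single socket. The helper name `set_user_timeout` and the 300000 ms value are assumptions for illustration. The option is Linux-specific, so the helper reports failure instead of raising on platforms that lack it.

```python
import socket


def set_user_timeout(sock, timeout_ms):
    """Try to set TCP_USER_TIMEOUT (in milliseconds) on one socket.

    Returns True if the option was applied, or False if the platform
    does not expose or does not accept it.
    """
    opt = getattr(socket, "TCP_USER_TIMEOUT", None)
    if opt is None:
        return False  # constant is not defined on this platform
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, timeout_ms)
    except OSError:
        return False  # the kernel or sandbox rejected the option
    return True


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Only this socket is affected; tcp_retries2 stays untouched.
        print("applied:", set_user_timeout(s, 300_000))
```

Because the option lives on the socket, different connections in the same process can carry different timeouts, which a system-wide sysctl cannot express.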
| Comment by Shane Harvey [ 19/Aug/22 ] | ||
|
When implementing this feature, we should be careful not to unintentionally increase the timeout for users who already set tcp_retries2 at the OS level. For example, it would not be ideal to set TCP_USER_TIMEOUT unconditionally, because it overrides tcp_retries2 and the user would end up with a longer retry period than they wanted. It could be simpler to implement DRIVERS-1707 in mongos instead. The main idea in DRIVERS-1707 is to cancel in-flight operations when an SDAM heartbeat fails with a network timeout. One caveat is that DRIVERS-1707 would only handle cluster connections on the mongos side, not intra-replica-set connections (e.g. agg $out on a secondary). Another important note is that TCP_USER_TIMEOUT overrides TCP_KEEPCNT for keepalive, which is why the recommendation is to set TCP_USER_TIMEOUT to slightly less than TCP_KEEPIDLE + TCP_KEEPINTVL * TCP_KEEPCNT. | ||
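The relationship between TCP_USER_TIMEOUT and the keepalive values can be sketched as a small calculation. All function names, the 1000 ms safety margin, and the example keepalive values below are assumptions chosen for illustration, not server defaults. The check treats 0 as "use the system default", which is how Linux interprets that value.

```python
def keepalive_budget_ms(keepidle_s, keepintvl_s, keepcnt):
    """TCP_KEEPIDLE + TCP_KEEPINTVL * TCP_KEEPCNT, converted to milliseconds."""
    return (keepidle_s + keepintvl_s * keepcnt) * 1000


def recommended_user_timeout_ms(keepidle_s, keepintvl_s, keepcnt, margin_ms=1000):
    """A value slightly below the keepalive budget (margin is an assumption)."""
    return max(keepalive_budget_ms(keepidle_s, keepintvl_s, keepcnt) - margin_ms, 0)


def user_timeout_warning(user_timeout_ms, keepidle_s, keepintvl_s, keepcnt):
    """Return a warning string if the timeout is not below the budget, else None."""
    if user_timeout_ms == 0:
        return None  # 0 leaves the system default in place
    budget = keepalive_budget_ms(keepidle_s, keepintvl_s, keepcnt)
    if user_timeout_ms >= budget:
        return (
            f"TCP_USER_TIMEOUT={user_timeout_ms}ms is not below the "
            f"keepalive budget of {budget}ms"
        )
    return None


if __name__ == "__main__":
    # Illustrative values only: 300 s idle, 10 s interval, 9 probes.
    print(recommended_user_timeout_ms(300, 10, 9))
    print(user_timeout_warning(400_000, 300, 10, 9))
```

A check along these lines is one way the server could log the warning suggested earlier in the thread when the configured values disagree.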
| Comment by Billy Donahue [ 18/Aug/22 ] | ||
|
sneak peek at this. |