[SERVER-37783] Adjust TCP_KEEPIDLE times based on $maxTimeMS Created: 26/Oct/18  Updated: 08/Jan/24

Status: Open
Project: Core Server
Component/s: Networking
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Benjamin Caimano (Inactive) Assignee: Backlog - Service Architecture
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-38811 TCP_KEEPINTVL should be 1 second Closed
Assigned Teams:
Service Arch
Backwards Compatibility: Minor Change
Sprint: Service Arch 2018-11-05, Service Arch 2018-11-19, Service Arch 2018-12-03, Service Arch 2018-12-17, Service Arch 2018-12-31, Service Arch 2022-05-30
Participants:

 Description   

We should look into setting the TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT socket options based on $maxTimeMS. This would let us distinguish a remote host that is taking a long time to perform the requested operation from one that has been blackholed. We need to do some testing to find the optimal values, but I think we should consider the host dead after something like maxTimeMS / 2, clamped to between 500 ms and 15 seconds. This would give us a chance to retry on a different host while still staying within the user-provided timeout. We may want to do this only for retryable operations.
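
As a rough illustration of the arithmetic only, the clamp and one possible split into keepalive settings might look like the Python sketch below. It is not server code: the function names, the choice of 2 probes, and the 1-second probe interval (borrowed from the related SERVER-38811) are assumptions made for the example. It also assumes Linux semantics, where TCP_KEEPIDLE and TCP_KEEPINTVL take whole seconds, so the 500 ms lower bound cannot be expressed exactly and is rounded up.

```python
import math
import socket

MIN_DEAD_AFTER_MS = 500      # lower bound proposed in the ticket
MAX_DEAD_AFTER_MS = 15_000   # upper bound proposed in the ticket


def dead_after_ms(max_time_ms: int) -> int:
    """Clamp maxTimeMS / 2 into the range [500 ms, 15 s]."""
    return max(MIN_DEAD_AFTER_MS, min(max_time_ms // 2, MAX_DEAD_AFTER_MS))


def keepalive_params(max_time_ms: int, probes: int = 2, interval_s: int = 1):
    """Return (idle_s, interval_s, probes), all in whole seconds.

    Detection time is roughly idle_s + interval_s * probes. The probe
    count and interval are assumed defaults, not values from the ticket.
    """
    budget_s = math.ceil(dead_after_ms(max_time_ms) / 1000)
    # Whatever the probes do not use becomes idle time, at least 1 second.
    idle_s = max(1, budget_s - interval_s * probes)
    return idle_s, interval_s, probes


def apply_keepalive(sock: socket.socket, max_time_ms: int) -> None:
    """Enable keepalive on sock and apply the derived settings if possible."""
    idle_s, interval_s, probes = keepalive_params(max_time_ms)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These option names are platform-specific; skip them where missing.
    names = ("TCP_KEEPIDLE", "TCP_KEEPINTVL", "TCP_KEEPCNT")
    if all(hasattr(socket, name) for name in names):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
```

With these assumed defaults, a maxTimeMS of 10000 gives a 5-second budget, split into 3 seconds of idle time plus two probes 1 second apart. For very small maxTimeMS values the whole-second granularity means detection takes longer than the 500 ms floor, which is one of the things the proposed testing would need to account for.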

Original Title: Make TCP_KEEPALIVE parameters configurable

See here.

AC: Investigate to see if this is something that we should pursue. Reach out to Kelsey and Kal for further context.



 Comments   
Comment by Lauren Lewis (Inactive) [ 21/Dec/21 ]

We haven’t heard back from you in at least 1 year, so I'm going to close this ticket. If this is still an issue for you, please provide additional information and we will reopen the ticket.
