[SERVER-56854] Provide the ability for RSM requests to timeout and mark the server as failed Created: 11/May/21 Updated: 29/Oct/23 Resolved: 28/May/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Networking |
| Affects Version/s: | None |
| Fix Version/s: | 4.0.25, 4.2 Required |
| Type: | New Feature | Priority: | Critical - P2 |
| Reporter: | Lamont Nelson | Assignee: | Lamont Nelson |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Backport Requested: | v4.2, v4.0 |
| Sprint: | Sharding 2021-05-17 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
Currently, a mongos node can send a hello request to a replica set member and never receive a response. In that case, the operation does not return until the connection times out on the mongos side, which can take several minutes depending on TCP keepalive settings. This ticket adds an application-level timeout so that the RSM (replica set monitor) can mark the unresponsive server as failed and keep monitoring the other nodes in the presence of TCP blackholes or similar network failures. The timeout should be on the order of seconds to ensure cluster availability. |
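The behavior requested above can be illustrated with a minimal sketch. It is not the server's RSM code: it is a Python simulation in which every name (`send_hello`, `check_host`, `monitor_once`, `HELLO_TIMEOUT_SECS`, the `rs0-*` host names) is hypothetical, and the timeout is shortened so the demo finishes quickly. It assumes each member is checked independently, so that one blackholed connection cannot stall the checks of the others.

```python
import asyncio

# All names here are hypothetical; this is not the server's RSM implementation.
# The ticket asks for a timeout "on the order of seconds"; a shorter value is
# used so the demo finishes quickly.
HELLO_TIMEOUT_SECS = 0.2


async def send_hello(host, delays):
    """Stand-in for a hello request. A delay of None simulates a TCP blackhole."""
    delay = delays[host]
    if delay is None:
        # Wait on an event that is never set: the reply never arrives.
        await asyncio.Event().wait()
    await asyncio.sleep(delay)
    return {"host": host, "ok": 1}


async def check_host(host, delays, timeout):
    """Run one hello check, bounded by an application-level timeout."""
    try:
        await asyncio.wait_for(send_hello(host, delays), timeout)
        return host, "ok"
    except asyncio.TimeoutError:
        # No reply within the timeout: mark the server as failed and move on.
        return host, "failed"


async def monitor_once(delays, timeout=HELLO_TIMEOUT_SECS):
    """Check all members concurrently so one silent member cannot block the rest."""
    results = await asyncio.gather(
        *(check_host(host, delays, timeout) for host in delays)
    )
    return dict(results)


def run_demo():
    # rs0-b never answers; the other two reply well within the timeout.
    delays = {"rs0-a": 0.0, "rs0-b": None, "rs0-c": 0.01}
    return asyncio.run(monitor_once(delays))
```

Calling `run_demo()` returns a status per host, with the silent member reported as failed after roughly `HELLO_TIMEOUT_SECS` instead of waiting for a TCP-level timeout.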
| Comments |
| Comment by Githook User [ 28/May/21 ] |
|
Author: {'name': 'LaMont Nelson', 'email': 'lamont.nelson@mongodb.com', 'username': 'lamontnelson'} Message: |
| Comment by Lamont Nelson [ 28/May/21 ] |