  1. Core Server
  2. SERVER-56854

Provide the ability for RSM requests to time out and mark the server as failed

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.2 Required, 4.0.25
    • Component/s: Networking
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Backport Requested:
      v4.2, v4.0
    • Sprint:
      Sharding 2021-05-17

      Description

      Currently, a mongos node can send a hello request to a replica set member and then wait indefinitely without receiving a response. In this case, the operation does not return until the connection times out on the mongos side, which can take several minutes depending on TCP keepalive settings.
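To give a rough sense of why the wait can run to minutes, the sketch below estimates how long TCP keepalive takes to declare an idle, unresponsive peer dead: the idle time before the first probe plus one probe interval per unanswered probe. The function name is hypothetical. The first set of values is the stock Linux defaults (`tcp_keepalive_time`, `tcp_keepalive_intvl`, `tcp_keepalive_probes`); the 120-second idle time is an assumed tuned value, not something stated in this ticket.

```python
def keepalive_detection_secs(idle_secs, probe_interval_secs, probe_count):
    """Approximate time for TCP keepalive to give up on a silent peer.

    Hypothetical helper: idle time before the first probe, plus one
    interval for each probe that goes unanswered.
    """
    return idle_secs + probe_interval_secs * probe_count


# Stock Linux defaults: 7200 s idle, 75 s between probes, 9 probes.
linux_defaults = keepalive_detection_secs(7200, 75, 9)

# Assumed tuned idle time of 120 s, with the other defaults unchanged.
tuned_idle = keepalive_detection_secs(120, 75, 9)

print(linux_defaults, tuned_idle)
```

Even with the shorter assumed idle time, detection still takes over thirteen minutes, well above the seconds-scale bound this ticket asks for.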

      This ticket adds an application-level timeout mechanism so that the RSM (ReplicaSetMonitor) can keep making progress monitoring other nodes in the presence of TCP blackholes or similar network failures. The timeout should be on the order of seconds to preserve cluster availability.
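The idea can be illustrated with a minimal sketch. It is not the server's implementation (which is C++); it only shows the pattern of bounding each monitoring request with an application-level timeout and marking the host as failed when the bound is exceeded. All names here (`send_hello`, `check_host`, `monitor_once`, `run_monitor`, `HELLO_TIMEOUT_SECS`) are hypothetical, and the blackholed host is simulated by a coroutine that never completes.

```python
import asyncio

# Hypothetical value, kept short so the demo finishes quickly.
# The ticket asks for a timeout "on the order of seconds".
HELLO_TIMEOUT_SECS = 0.05


async def send_hello(host, blackholed):
    """Stand-in for a real hello request (assumption, not a driver API)."""
    if host in blackholed:
        # Never set: simulates a peer that silently drops all traffic.
        await asyncio.Event().wait()
    return {"host": host, "ok": 1}


async def check_host(host, blackholed, timeout):
    """Run one hello with an application-level timeout."""
    try:
        await asyncio.wait_for(send_hello(host, blackholed), timeout)
        return host, "ok"
    except asyncio.TimeoutError:
        # No reply within the bound: mark the server as failed and move on.
        return host, "failed"


async def monitor_once(hosts, blackholed, timeout):
    """Check all hosts concurrently so one hung host cannot stall the rest."""
    results = await asyncio.gather(
        *(check_host(h, blackholed, timeout) for h in hosts)
    )
    return dict(results)


def run_monitor(hosts, blackholed, timeout=HELLO_TIMEOUT_SECS):
    return asyncio.run(monitor_once(hosts, blackholed, timeout))


if __name__ == "__main__":
    print(run_monitor(["a:27017", "b:27017", "c:27017"], {"b:27017"}))
```

In this sketch the unresponsive host is reported as failed after the timeout elapses, while the other hosts are still checked and reported normally.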


              People

              Assignee:
              lamont.nelson Lamont Nelson
              Reporter:
              lamont.nelson Lamont Nelson
              Votes:
              0
              Watchers:
              13