RSM fails to mark node as Unknown in response to failed hellos at shutdown

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: 9.0.0-rc0
    • Component/s: None
    • Networking & Observability
    • ALL
    • N&O 2026-04-27
    • 200

      The path of a failed hello due to shutdown prior to SERVER-120733:

      • RSM receives an ok: 0 hello response with ShutdownInProgress
      • It checks only whether the RPC succeeded and never inspects the BSON for ok: 0, so it calls into onHelloSuccess even though the hello failed.
      • This calls into StreamableReplicaSetMonitor::onServerHeartbeatSucceededEvent, which constructs a HelloOutcome with _success set to true and passes it to onServerDescription.
      • onServerDescription constructs a new ServerDescription from the HelloOutcome.
      • Within the "success path" of the ctor, it calls parseTypeFromHelloReply, which checks for ok: 0 again and correctly sets the server type to Unknown, making the server unroutable.
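
The steps above can be sketched in a small, self-contained C++ model. This is only an illustration of the control flow described in this ticket, not the server code: every type and function below (HelloReply, classifyPreFix, parseTypeFromReplySketch, and so on) is a hypothetical stand-in, and the two-value ServerType enum is a simplification.

```cpp
#include <string>

// Hypothetical, simplified stand-ins. None of these are the real server types.
enum class ServerType { Unknown, RSPrimary };

struct HelloReply {
    bool rpcOk;            // did the RPC itself complete?
    int ok;                // value of the "ok" field in the reply BSON
    std::string codeName;  // e.g. "ShutdownInProgress"
};

struct HelloOutcome {
    bool success;
    HelloReply reply;
};

// Plays the role the ticket attributes to parseTypeFromHelloReply:
// it looks at "ok" again and yields Unknown when the reply is ok: 0.
ServerType parseTypeFromReplySketch(const HelloReply& reply) {
    return reply.ok == 1 ? ServerType::RSPrimary : ServerType::Unknown;
}

struct ServerDescription {
    ServerType type;

    // Models only the "success path" of the constructor described above.
    explicit ServerDescription(const HelloOutcome& outcome)
        : type(parseTypeFromReplySketch(outcome.reply)) {}

    bool routable() const { return type != ServerType::Unknown; }
};

// Pre-SERVER-120733 behavior as described: only the RPC status is consulted,
// so an ok: 0 reply is still reported as a successful hello.
HelloOutcome classifyPreFix(const HelloReply& reply) {
    return HelloOutcome{reply.rpcOk, reply};
}
```

In this model, a reply of `HelloReply{true, 0, "ShutdownInProgress"}` is classified as a success, yet the resulting ServerDescription still ends up with type Unknown, which matches the accidental but correct outcome described in the list.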

      Since this all happens on the supposed "success" path, we never see "Host failed in replica set", but we do see a topology description update log message. (I don't have a link to logs that corroborate this, but it is visible in passing runs of the test prior to SERVER-120733.)

      SERVER-120733 updates StreamableReplicaSetMonitor to properly take the error path on failed hellos (which is why we now correctly see log messages indicating the failure), but this has the unfortunate side effect of skipping the ServerDescription constructor described above. Instead, the failure goes into computeErrorActions with isApplicationOperation: false. As I noted in an earlier message, this does not result in the server being marked Unknown, so the server remains routable.
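
The post-SERVER-120733 behavior can be sketched the same way. Again, this is a hypothetical model and not the server implementation: TopologySketch, ErrorActions, and computeErrorActionsSketch are invented names. The ticket only states that the isApplicationOperation: false case does not mark the server Unknown; how the sketch treats the true case is an assumption added purely for contrast.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified stand-ins. None of these are the real server types.
enum class ServerType { Unknown, RSPrimary };

struct ErrorActions {
    bool markServerUnknown;
};

// Stand-in for computeErrorActions. Only the false case is taken from the
// ticket; returning true for application operations is an ASSUMPTION.
ErrorActions computeErrorActionsSketch(bool isApplicationOperation) {
    return ErrorActions{isApplicationOperation};
}

class TopologySketch {
public:
    void setType(const std::string& host, ServerType type) { _types[host] = type; }

    bool routable(const std::string& host) const {
        auto it = _types.find(host);
        return it != _types.end() && it->second != ServerType::Unknown;
    }

    // Post-fix monitoring path: a failed hello is routed to error handling
    // instead of constructing a new ServerDescription from the reply.
    void onHelloFailed(const std::string& host) {
        ErrorActions actions =
            computeErrorActionsSketch(/*isApplicationOperation=*/false);
        if (actions.markServerUnknown) {
            _types[host] = ServerType::Unknown;
        }
        // Otherwise the previous description, and its routability, is kept.
    }

private:
    std::map<std::string, ServerType> _types;
};
```

In this model, a host that was RSPrimary before the failed hello is still routable afterwards, which is the bug this ticket reports.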

            Assignee:
            Unassigned
            Reporter:
            Patrick Freed