Node.js driver fails to recover from unresponsive Primary

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.2.22
    • Affects Version/s: 2.2.16
    • Component/s: None

      If the primary stops responding over its TCP connection but the connection itself is not terminated (which can happen during network problems), the driver gets stuck and cannot fail over to the other members of the replica set. The problem is easy to reproduce by running a replica set locally and sending SIGSTOP to the primary.

      I have created a simple test application (setup scripts included) that illustrates the problem:
      https://github.com/OleksandrChekhovskyi/mongo-replset-test
      Exact repro steps are described in the README file.
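The failure mode described above (a connection that stays open while the peer stops answering) can be imitated without MongoDB. The following is a minimal sketch, not taken from the linked repository: `startSilentServer` and `probe` are hypothetical helper names, and the silent TCP server is only a stand-in for a primary that has received SIGSTOP. It relies on the standard `net.Socket#setTimeout` idle timer to notice that no reply is coming.

```javascript
'use strict';
const net = require('net');

// Hypothetical stand-in for a stopped primary: it accepts TCP connections,
// discards whatever it receives, and never writes a reply.
function startSilentServer() {
  return new Promise((resolve) => {
    const server = net.createServer((socket) => {
      socket.resume();              // drain incoming bytes, send nothing back
      socket.on('error', () => {}); // ignore resets when the client gives up
    });
    // Port 0 asks the OS for any free port.
    server.listen(0, '127.0.0.1', () => resolve(server));
  });
}

// Resolves with 'reply' if the peer answers, or with 'unresponsive' if the
// socket sees no activity for `idleMs` milliseconds. The socket is destroyed
// in both cases, because the 'timeout' event does not close it by itself.
function probe(port, idleMs) {
  return new Promise((resolve, reject) => {
    const socket = net.connect({ port, host: '127.0.0.1' });
    socket.setTimeout(idleMs);
    socket.once('connect', () => socket.write('ping\n'));
    socket.once('data', () => {
      socket.destroy();
      resolve('reply');
    });
    socket.once('timeout', () => {
      socket.destroy();
      resolve('unresponsive');
    });
    socket.once('error', reject);
  });
}
```

Calling `probe(server.address().port, 200)` against the silent server resolves to `'unresponsive'` even though the connection was established successfully, which is the situation the driver has to handle.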

      The cause seems to be that the ismaster ping is issued sequentially to all servers. If the first server hangs, the driver never receives the new replica set configuration, because it keeps disconnecting from and reconnecting to that server.
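One way to avoid the sequential stall described above is to ping every member concurrently and give each ping its own timeout. The sketch below is only an illustration under stated assumptions: it is not the driver's code and not necessarily the fix shipped in 2.2.22. `fakeIsMaster`, `withTimeout`, and `pingAll` are hypothetical names, and the ismaster round trip is simulated with a timer so the example runs without a replica set.

```javascript
'use strict';

// Hypothetical stand-in for an ismaster round trip. It resolves after
// `delayMs`, or never settles when `delayMs` is Infinity (a hung server).
function fakeIsMaster(host, delayMs) {
  return new Promise((resolve) => {
    if (delayMs === Infinity) return; // connection open, reply never arrives
    setTimeout(() => resolve({ host, ok: 1 }), delayMs);
  });
}

// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Clear the timer once either side settles so it cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Pings all servers at once. A hung member only costs its own timeout and
// cannot delay the answers from the healthy members.
async function pingAll(servers, timeoutMs) {
  const results = await Promise.allSettled(
    servers.map((s) =>
      withTimeout(fakeIsMaster(s.host, s.delayMs), timeoutMs, s.host)
    )
  );
  return results.map((result, i) => ({
    host: servers[i].host,
    reachable: result.status === 'fulfilled',
  }));
}
```

With one server given `delayMs: Infinity` and the others a small delay, `pingAll` still reports the healthy members as reachable after at most `timeoutMs`, whereas a loop that awaits each ping in turn without a timeout would never get past the first server.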

            Assignee:
            Christian Amor Kvalheim
            Reporter:
            Oleksandr Chekhovskyi
            Votes:
            0
            Watchers:
            2
