Core Server / SERVER-41031

After an unreachable node is added to and then removed from the replica set, the other replica set members continue to send heartbeats to the removed node


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: 4.0.9
    • Fix Version/s: 5.0 Desired
    • Component/s: Replication
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:

      Steps to reproduce:
      1. Start a replica set on 4.0.9
      2. Connect to the primary and run rs.add("NEW_HOST:2017"), where NEW_HOST is a server that the replica set members cannot connect to.
      3. Heartbeats to this node fail. This is expected.
      4. Run rs.remove("NEW_HOST:2017") to remove the node from the replica set.
      5. After the node is removed, the replica set members still send heartbeats to it, which is unexpected.

    • Sprint:
      Repl 2019-07-01, Repl 2019-07-15, Repl 2019-07-29, Repl 2019-08-12, Repl 2019-09-23, Repl 2019-10-07, Repl 2019-10-21, Repl 2019-11-04

      Description

      • This issue is reproducible on 4.0 but not on 3.6.
      • Stepping down the primary does not appear to stop the heartbeats to the removed node.
      • In 4.0, if NEW_HOST is reachable from the replica set members (no matter whether the mongod process is running on NEW_HOST or not), the issue is not reproducible.

        Reproduced on RHEL 7. I didn't try other platforms, so I'm not sure whether this is platform-specific.
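
The unexpected behavior in step 5 of the steps to reproduce could be checked mechanically by comparing the hosts that members are seen sending heartbeats to against the current replica set config. The following is a minimal sketch, not part of MongoDB: the helper name staleHeartbeatTargets is hypothetical, the config object is assumed to have the shape returned by rs.conf() (a members array whose entries carry a host string), and the host names are placeholders. How the observed targets are collected (for example, from server logs) is left open, and comparing host names case-insensitively is an assumption of this sketch.

```javascript
// Hypothetical helper (not a MongoDB API): returns the "host:port" strings
// that were observed as heartbeat targets but are absent from the config.
function staleHeartbeatTargets(config, observedTargets) {
  // Hosts currently in the replica set config, compared case-insensitively.
  const current = new Set(config.members.map((m) => m.host.toLowerCase()));
  const stale = new Set();
  for (const target of observedTargets) {
    if (!current.has(target.toLowerCase())) {
      stale.add(target); // a Set drops repeated sightings of the same target
    }
  }
  return [...stale];
}

// Placeholder config, shaped like rs.conf() output after rs.remove("NEW_HOST:2017").
const configAfterRemove = {
  _id: "rs0",
  version: 3,
  members: [
    { _id: 0, host: "host1:27017" },
    { _id: 1, host: "host2:27017" },
    { _id: 2, host: "host3:27017" },
  ],
};

// Placeholder observations: the removed node still shows up as a target.
const observed = ["host2:27017", "NEW_HOST:2017", "host3:27017", "NEW_HOST:2017"];

console.log(staleHeartbeatTargets(configAfterRemove, observed));
```

With these placeholder inputs the helper reports NEW_HOST:2017 as the only stale target. An empty result would indicate that heartbeats are going only to current members, which is the behavior expected after the removal.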

              People

              Assignee:
              backlog-server-repl Backlog - Replication Team
              Reporter:
              linda.qin Linda Qin
              Votes:
              1
              Watchers:
              18
