  1. Core Server
  2. SERVER-35058

Don't only rely on heartbeat to signal secondary positions in stepdown command

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.7, 4.0.2, 4.1.1
    • Affects Version/s: None
    • Component/s: Replication
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Backport Requested: v4.0, v3.6
    • Sprint: Repl 2018-06-04, Repl 2018-06-18, Repl 2018-07-02

      The replSetStepDown command waits until a majority of nodes have caught up and one of them is an eligible candidate, but that condition is only signaled when heartbeat responses are processed, which adds delay to the handoff.

      The easiest but least efficient fix is to signal the condition variable whenever we update the last applied optime. A better solution is to replace the condition variable with a waiter in _replicationWaiterList, as in _awaitReplication_inlock(). A third option is to call _awaitReplication_inlock() directly, which might not be desirable since the condition the stepdown command waits on is slightly different from w: majority plus an eligible candidate specified in the config.
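
      The waiter-list idea can be sketched roughly as follows. This is a simplified, single-threaded toy model, not the actual replication coordinator code: ToyCoordinator, Member, addWaiter, updateLastApplied and stepDownCondition are hypothetical names invented for illustration, optimes are reduced to plain integers, and the locking that real code would need is omitted. The point it tries to show is that the stepdown condition gets re-evaluated on every last-applied update rather than only on heartbeat responses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a replica set member's state.
struct Member {
    std::uint64_t lastApplied;  // stand-in for an OpTime
    bool electable;             // assumed false for the node that is stepping down
};

// Toy model of a coordinator that keeps a list of waiters (assumption: no threads, no mutex).
class ToyCoordinator {
public:
    using Predicate = std::function<bool(const std::vector<Member>&)>;
    using Callback = std::function<void()>;

    explicit ToyCoordinator(std::vector<Member> members) : _members(std::move(members)) {}

    // Register a waiter; run the callback immediately if the condition already holds.
    void addWaiter(Predicate pred, Callback onReady) {
        if (pred(_members)) {
            onReady();
            return;
        }
        _waiters.push_back({std::move(pred), std::move(onReady)});
    }

    // Every position update re-checks the waiters, so no heartbeat is needed to notice progress.
    void updateLastApplied(std::size_t idx, std::uint64_t opTime) {
        assert(idx < _members.size());
        _members[idx].lastApplied = std::max(_members[idx].lastApplied, opTime);
        _wakeReadyWaiters();
    }

private:
    struct Waiter {
        Predicate pred;
        Callback onReady;
    };

    void _wakeReadyWaiters() {
        // Collect satisfied waiters first so callbacks may safely register new waiters.
        std::vector<Waiter> ready;
        for (auto it = _waiters.begin(); it != _waiters.end();) {
            if (it->pred(_members)) {
                ready.push_back(std::move(*it));
                it = _waiters.erase(it);
            } else {
                ++it;
            }
        }
        for (auto& w : ready) {
            w.onReady();
        }
    }

    std::vector<Member> _members;
    std::vector<Waiter> _waiters;
};

// Stepdown condition as described in the ticket: a majority of nodes caught up to the
// given optime, and at least one caught-up node is an eligible candidate.
ToyCoordinator::Predicate stepDownCondition(std::uint64_t targetOpTime) {
    return [targetOpTime](const std::vector<Member>& members) {
        std::size_t caughtUp = 0;
        bool haveCandidate = false;
        for (const Member& m : members) {
            if (m.lastApplied >= targetOpTime) {
                ++caughtUp;
                haveCandidate = haveCandidate || m.electable;
            }
        }
        return caughtUp > members.size() / 2 && haveCandidate;
    };
}
```

      In this sketch the stepdown path would call addWaiter(stepDownCondition(primaryOpTime), ...) once, and the callback fires from whichever updateLastApplied call first satisfies the condition. Keeping the predicate separate from the waiter list is what lets the stepdown condition differ from a plain write-concern check while reusing the same wake-up mechanism.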

            vesselina.ratcheva@mongodb.com Vesselina Ratcheva (Inactive)
            siyuan.zhou@mongodb.com Siyuan Zhou