  Core Server / SERVER-43904

When stepping down, step up doesn't filter out frozen nodes

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: 3.6.14
    • Fix Version/s: 4.5 Required
    • Component/s: None

      Description

      One of the recommended ways [0] to force a particular node to become primary is to freeze all non-candidate nodes and then call replSetStepDown on the primary. As of MongoDB 3.6, the stepdown code attempts to step up a candidate (by calling replSetStepUp). However, that code doesn't exclude frozen nodes, and an attempt to step up a frozen node simply fails ("2019-10-09T00:24:05.517+0000 I REPL [conn352334] Not starting an election for a replSetStepUp request, since we are not electable due to: Not standing for election because I am still waiting for stepdown period to end at 2019-10-09T00:33:59.473+0000 (mask 0x20)"). This isn't particularly bad, since the unfrozen node will still call for, and win, an election, but it does make failovers slower (presumably by up to electionTimeoutMillis).
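For reference, the freeze-then-step-down procedure described above can be sketched as a small plan builder. This is only an illustration, not code from the server or the docs: the function names, host strings, and default durations are hypothetical, and `send` is a caller-supplied callable (for example, a thin wrapper around a driver's admin command call). Only the command names `replSetFreeze` and `replSetStepDown` come from MongoDB itself.

```python
from typing import Callable, Dict, List, Tuple

Command = Dict[str, int]
Step = Tuple[str, Command]


def freeze_then_step_down_plan(
    primary: str,
    secondaries: List[str],
    candidate: str,
    freeze_secs: int = 120,      # hypothetical default
    step_down_secs: int = 120,   # hypothetical default
) -> List[Step]:
    """Return (host, command) pairs in the order they should be sent."""
    if candidate not in secondaries:
        raise ValueError("candidate must be one of the secondaries")
    # Freeze every secondary except the one we want to win the election.
    plan: List[Step] = [
        (host, {"replSetFreeze": freeze_secs})
        for host in secondaries
        if host != candidate
    ]
    # Only after the freezes are in place, ask the primary to step down.
    plan.append((primary, {"replSetStepDown": step_down_secs}))
    return plan


def run_plan(plan: List[Step], send: Callable[[str, Command], None]) -> None:
    """Send each command to its host, in order, via the supplied callable."""
    for host, command in plan:
        send(host, command)
```

With this ordering, the only unfrozen secondary is the intended candidate, which is the situation in which the behaviour reported here (a step-up attempt landing on a frozen node) adds delay rather than changing the outcome.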

      An alternative approach that we're using, which isn't explicitly documented, is to increase the priority of both the current primary and the candidate node, and then run replSetStepDown. I've verified in both code and logs that this consistently gets mongo to step up the candidate node. It might be nice to document this approach, since I think it improves on both approaches currently mentioned. Increasing the priority of just the candidate works, but tends to be slower, since the "priority takeover" mechanism takes a few seconds to trigger, and it provides less control than an explicit replSetStepDown.
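A minimal sketch of the config change behind the priority-based approach, assuming the replica set config has already been fetched as a plain dict (for example, from `replSetGetConfig`). The helper name, host strings, and the priority value used in the example are hypothetical. The `members`, `host`, `priority`, and `version` fields are standard replica set config fields.

```python
import copy
from typing import Any, Dict, Iterable


def raise_priorities(
    config: Dict[str, Any],
    hosts: Iterable[str],
    new_priority: float,
) -> Dict[str, Any]:
    """Return a copy of `config` with `new_priority` set on the given hosts.

    The version is incremented so the result can be passed to replSetReconfig.
    The input config is left unmodified.
    """
    wanted = set(hosts)
    updated = copy.deepcopy(config)
    found = set()
    for member in updated["members"]:
        if member["host"] in wanted:
            member["priority"] = new_priority
            found.add(member["host"])
    missing = wanted - found
    if missing:
        raise ValueError(f"hosts not in replica set config: {sorted(missing)}")
    # A reconfig needs a version greater than the current one.
    updated["version"] = config["version"] + 1
    return updated
```

The returned document would be sent to the primary as `{"replSetReconfig": updated}`, followed by `replSetStepDown` on the primary. Passing both the current primary and the candidate in `hosts` matches the approach described above.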

      [0] https://docs.mongodb.com/manual/tutorial/force-member-to-be-primary/

              People

              Assignee:
              backlog-server-repl Backlog - Replication Team
              Reporter:
              bartle David Bartley
              Votes:
              0
              Watchers:
              16
