  1. Core Server
  2. SERVER-35442

stepdown global lock acquisition should use wait time, not freeze time

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.2, 4.1.2
    • Component/s: Replication
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.0
    • Sprint:
      Repl 2018-07-16, Repl 2018-07-30

      Description

      When we added interruptibility to lock acquisitions, we chose the "stepDownUntil" deadline as the timeout for the global lock acquisition in ReplicationCoordinatorImpl::stepDown(). This unfortunately named variable is actually the freeze time, which dictates how long the node waits before attempting to become primary again after the stepdown has finished and the function has returned.
      Instead, we should use the "waitUntil" deadline, which is the time until which the user is willing to wait for the stepdown to complete before the command gives up and returns an error.

      This function is used by both the replSetStepDown and shutdown commands, so this bug affects both.
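
The difference between the two deadlines can be sketched with a small standalone simulation. This is not the server code: `simulateStepDown`, `StepDownResult`, and the use of `std::timed_mutex` as a stand-in for the global lock are all assumptions made for illustration. Only the names `waitUntil` and `stepDownUntil` come from the description above.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

using Clock = std::chrono::steady_clock;
using Millis = std::chrono::milliseconds;

// Hypothetical result type: whether the lock was taken and how long we blocked.
struct StepDownResult {
    bool acquired;
    Millis waited;
};

// Hypothetical simulation. A conflicting operation holds a stand-in "global lock"
// for holdTime while a stepdown attempt tries to acquire it.
// useWaitDeadline == true models the intended behavior (give up at waitUntil);
// false models the bug (keep blocking until stepDownUntil, the freeze deadline).
StepDownResult simulateStepDown(Millis holdTime, Millis waitTime,
                                Millis freezeTime, bool useWaitDeadline) {
    std::timed_mutex globalLock;  // stand-in, not the real lock manager
    std::promise<void> lockHeld;

    // Conflicting operation: take the lock, signal, and hold it for holdTime.
    std::thread holder([&] {
        std::lock_guard<std::timed_mutex> guard(globalLock);
        lockHeld.set_value();
        std::this_thread::sleep_for(holdTime);
    });
    lockHeld.get_future().wait();  // make sure the lock is really held first

    const auto start = Clock::now();
    const auto waitUntil = start + waitTime;        // how long the caller will wait
    const auto stepDownUntil = start + freezeTime;  // how long the node stays frozen
    const auto lockDeadline = useWaitDeadline ? waitUntil : stepDownUntil;

    const bool acquired = globalLock.try_lock_until(lockDeadline);
    const auto waited =
        std::chrono::duration_cast<Millis>(Clock::now() - start);
    if (acquired) {
        globalLock.unlock();
    }
    holder.join();
    return {acquired, waited};
}
```

In this sketch, calling `simulateStepDown(Millis(300), Millis(50), Millis(1000), true)` gives up on the lock after roughly the 50 ms wait time, whereas passing `false` keeps the caller blocked until the conflicting operation releases the lock, well past the wait time the caller asked for.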
