Core Server / SERVER-12170

Do not call relinquish() when not vetoing an election

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.4.10, 2.5.5
    • Affects Version/s: 2.5.4
    • Component/s: Replication
    • Labels:
      None
    • Operating System: ALL

      Issue Status as of March 31, 2014

      ISSUE SUMMARY

      In the election logic, a node that is not vetoing an election calls the relinquish() method, which steps down a primary or changes the node's state from STARTUP2 to RECOVERING. This call is unnecessary and can delay the election or cause it to time out, because relinquish() takes a write lock to clear out the write buffer.

      USER IMPACT

      This bug can delay elections or cause them to time out.

      SOLUTION

      The fix was to remove the unnecessary call to relinquish().
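
      The change can be pictured with a toy model of the vote-request handler. This is only a sketch in Python, not the server's C++ code: the names Node, State and handle_election_request are hypothetical, the call_relinquish flag exists only to compare the old and new behaviour side by side, and modelling a step-down as a move to SECONDARY is an assumption of the sketch.

```python
from enum import Enum


class State(Enum):
    STARTUP2 = "STARTUP2"
    RECOVERING = "RECOVERING"
    SECONDARY = "SECONDARY"
    PRIMARY = "PRIMARY"


class Node:
    """Hypothetical stand-in for a replica set member."""

    def __init__(self, state: State) -> None:
        self.state = state

    def relinquish(self) -> None:
        # Modelled on the issue summary: step down a primary, or move a
        # node from STARTUP2 to RECOVERING. The SECONDARY target for a
        # step-down is an assumption of this sketch.
        if self.state is State.PRIMARY:
            self.state = State.SECONDARY
        elif self.state is State.STARTUP2:
            self.state = State.RECOVERING


def handle_election_request(node: Node, should_veto: bool,
                            call_relinquish: bool) -> str:
    """Answer a vote request; call_relinquish=True mimics the old code path."""
    if should_veto:
        return "veto"
    if call_relinquish:
        # Pre-fix behaviour: an unnecessary state change on the non-veto path.
        node.relinquish()
    return "no veto"


old = Node(State.STARTUP2)
handle_election_request(old, should_veto=False, call_relinquish=True)

new = Node(State.STARTUP2)
handle_election_request(new, should_veto=False, call_relinquish=False)
```

      In this model the old path leaves the node in RECOVERING after a non-vetoing reply, while the fixed path leaves it in STARTUP2.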

      WORKAROUNDS

      None

      AFFECTED VERSIONS

      All recent production releases up to and including 2.4.9 are affected.

      PATCHES

      The fix is included in the 2.4.10 production release and the 2.5.5 development version, which will evolve into the 2.6.0 production release.

      Original Description

      The call to relinquish() serves no purpose here and causes two bugs:
      1. A node can transition from STARTUP2 to RECOVERING too early, which leads to incorrect replica set logic later.
      2. relinquish() attempts to acquire the global write lock while holding the rs mutex, which may delay heartbeats and elections if a long-running write operation (such as a foreground index build) is already in progress.
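
      The second bug is a lock-ordering stall, which the following Python sketch reproduces with two plain locks. It is an illustration under stated assumptions rather than the server's code: global_write_lock and rs_mutex are ordinary threading.Lock objects standing in for the real locks, heartbeat_delay is a hypothetical helper, and a 0.5 second sleep plays the part of the long-running write.

```python
import threading
import time


def heartbeat_delay(call_relinquish: bool, write_op_seconds: float = 0.5) -> float:
    """Return how long a heartbeat waits for the rs mutex in this toy model."""
    global_write_lock = threading.Lock()  # stands in for the global write lock
    rs_mutex = threading.Lock()           # stands in for the rs mutex
    write_started = threading.Event()
    handler_has_mutex = threading.Event()

    def long_write() -> None:
        # A long-running write (e.g. a foreground index build) holds the
        # global write lock for its whole duration.
        with global_write_lock:
            write_started.set()
            time.sleep(write_op_seconds)

    def election_handler() -> None:
        write_started.wait()
        with rs_mutex:
            handler_has_mutex.set()
            if call_relinquish:
                # Pre-fix path: block on the global write lock while
                # still holding the rs mutex.
                with global_write_lock:
                    pass

    writer = threading.Thread(target=long_write)
    handler = threading.Thread(target=election_handler)
    writer.start()
    handler.start()

    # The heartbeat (run here on the calling thread) needs the rs mutex.
    handler_has_mutex.wait()
    start = time.monotonic()
    with rs_mutex:
        waited = time.monotonic() - start

    writer.join()
    handler.join()
    return waited


delay_before_fix = heartbeat_delay(call_relinquish=True)
delay_after_fix = heartbeat_delay(call_relinquish=False)
```

      With the relinquish() path enabled, the heartbeat waits for roughly the remaining duration of the write; without it, the handler releases the rs mutex almost immediately and the heartbeat is not held up by the write.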

            Assignee: Matt Dannenberg
            Reporter: Matt Dannenberg
            Votes: 0
            Watchers: 4
