  1. Core Server
  2. SERVER-59212

Make sure node stepped down before waiting for catchup takeover in catchup_takeover_with_higher_config.js

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 5.1 Required
    • Fix Version/s: 5.0.3, 4.4.9, 5.1.0-rc0
    • Component/s: Replication
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v5.0, v4.4
    • Sprint:
      Repl 2021-08-23
    • Linked BF Score:
      150

      Description

      There is a race in catchup_takeover_with_higher_config.js. While waiting for node1 to perform a catchup takeover of node2, the test checks that all nodes see node1 as primary. However, node1 may not have finished stepping down (in response to node2's step-up) before the wait begins. In that case node1 is still in primary state, so the wait condition is trivially satisfied even though the catchup takeover has not happened yet. Later, once the failpoint on node2 is lifted, node2 successfully becomes primary (because node1 never performed the catchup takeover) and remains primary until the end of the test, causing this line to hang forever.
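
      The race can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: `FakeNode`, `waitFor`, `makeScenario`, `racyWait`, and `safeWait` are hypothetical names, the tick numbers are arbitrary, and none of this is the real test code or the ReplSetTest API. It models node1 as still reporting PRIMARY when the wait starts, stepping down shortly after, and only later regaining primary through catchup takeover.

```javascript
// Hypothetical, simplified model of the race described above.
// Not the real jstest and not the ReplSetTest API.
class FakeNode {
    constructor(name, state) {
        this.name = name;
        this.state = state;  // "PRIMARY" or "SECONDARY"
    }
}

// Poll `predicate`, advancing the simulated cluster by one tick between polls.
// Returns the number of ticks spent in this call before the predicate held.
function waitFor(predicate, tick, maxTicks = 100) {
    for (let elapsed = 0; elapsed <= maxTicks; elapsed++) {
        if (predicate()) {
            return elapsed;
        }
        tick();
    }
    throw new Error("timed out waiting for condition");
}

// Assumed timeline: node1 has not yet stepped down at tick 0, steps down at
// tick 2, and becomes primary again via catchup takeover at tick 5.
function makeScenario() {
    const node1 = new FakeNode("node1", "PRIMARY");
    let now = 0;
    const tick = () => {
        now++;
        if (now === 2) node1.state = "SECONDARY";  // delayed stepdown
        if (now === 5) node1.state = "PRIMARY";    // catchup takeover
    };
    return {node1, tick, elapsed: () => now};
}

// Racy: only waits for node1 to be primary, which is already true at tick 0.
function racyWait() {
    const s = makeScenario();
    waitFor(() => s.node1.state === "PRIMARY", s.tick);
    return s.elapsed();
}

// Safer: first confirm the stepdown, then wait for the takeover.
function safeWait() {
    const s = makeScenario();
    waitFor(() => s.node1.state === "SECONDARY", s.tick);
    waitFor(() => s.node1.state === "PRIMARY", s.tick);
    return s.elapsed();
}
```

      In this model `racyWait()` returns at tick 0, before any takeover, while `safeWait()` returns only at tick 5, after the simulated takeover. Waiting for the stepdown first is what the issue title asks for.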

            People

            Assignee:
            wenbin.zhu Wenbin Zhu
            Reporter:
            wenbin.zhu Wenbin Zhu