  1. Core Server
  2. SERVER-27160

Narrow race at startup between RSSync setting RECOVERING and BGSync setting ROLLBACK state

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • 3.5.13
    • None
    • Replication
    • None
    • Fully Compatible
    • ALL
    • Repl 2017-10-02, Repl 2017-10-23

    Description

      At startup, the RSSync thread is responsible for transitioning the node from STARTUP2 to RECOVERING. At the same time the BGSync thread may decide that rollback is necessary and try to transition to ROLLBACK. If BGSync wins the race and we go into ROLLBACK first, RSSync can then transition us to RECOVERING while rollback is still running. If this happens before the rollback process sets minValid, it can cause RSSync to go live as SECONDARY. In the right kind of network partition this could theoretically lead to us running and being elected PRIMARY.

      While I don't see any synchronization that would actively prevent this case, it seems fairly unlikely to happen in practice because it would require BGSync to complete several network round trips before RSSync is able to do the small amount of work it does before setting RECOVERING.
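
The interleaving described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's code or the fix that was applied: `Node`, `set_state_unconditionally`, and `transition_if` are hypothetical names invented for illustration. It shows how an unconditional state write from RSSync overwrites ROLLBACK, and how a transition that checks the expected current state (STARTUP2) would be rejected instead.

```python
import threading
from enum import Enum


class MemberState(Enum):
    STARTUP2 = "STARTUP2"
    RECOVERING = "RECOVERING"
    ROLLBACK = "ROLLBACK"


class Node:
    """Toy stand-in for a replica set member (hypothetical, not MongoDB code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.state = MemberState.STARTUP2

    def set_state_unconditionally(self, new_state):
        # Models a write that does not look at the current state.
        with self._lock:
            self.state = new_state

    def transition_if(self, expected, new_state):
        # Models a guarded transition: only applies if the node is still
        # in the state the caller believes it is in.
        with self._lock:
            if self.state is not expected:
                return False
            self.state = new_state
            return True


def racy_interleaving():
    node = Node()
    # BGSync wins the race and enters ROLLBACK first.
    node.set_state_unconditionally(MemberState.ROLLBACK)
    # RSSync then sets RECOVERING while rollback is still running.
    node.set_state_unconditionally(MemberState.RECOVERING)
    return node.state


def guarded_interleaving():
    node = Node()
    node.set_state_unconditionally(MemberState.ROLLBACK)
    # RSSync only moves to RECOVERING if the node is still in STARTUP2.
    applied = node.transition_if(MemberState.STARTUP2, MemberState.RECOVERING)
    return node.state, applied
```

In the racy ordering the node ends up in RECOVERING even though rollback has not finished; in the guarded ordering the RSSync transition is refused and the node stays in ROLLBACK.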

          People

            siyuan.zhou@mongodb.com Siyuan Zhou
            mathias@mongodb.com Mathias Stearn
            Votes: 0
            Watchers: 5
