Core Server / SERVER-57186

Catchup takeover should not happen when last applied optime is in current term

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.1.0
    • Component/s: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v5.0
    • Sprint: Repl 2021-06-28

      Description

      We currently initiate a catchup takeover when we receive a heartbeat, no election is in progress, and the primary's optime is behind a secondary node's optime. In a chaining situation, our recorded optime for the primary can be staler than our own optime, because we receive writes through a different path (the OplogFetcher) and do not update the primary's optime based on them. This causes us to schedule a catchup takeover and then immediately cancel it when we realize that another secondary is ahead of us, as in HELP-24655.

      I believe we are potentially in a catchup situation only when the term of our last applied optime is less than our current election term; if the two terms are the same, the current primary has already caught up successfully. Checking this condition would avoid scheduling and then canceling a catchup takeover.
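
      The proposed check can be sketched roughly as follows. This is a minimal illustration in Python, not the server's actual code: the `OpTime` class, the `should_schedule_catchup_takeover` function, and all parameter names are hypothetical, and the "primary is behind" condition is simplified to a comparison between our own last applied optime and our recorded optime for the primary.

      ```python
      from dataclasses import dataclass


      @dataclass(frozen=True, order=True)
      class OpTime:
          """Hypothetical stand-in for an optime, ordered by (term, timestamp)."""
          term: int
          timestamp: int


      def should_schedule_catchup_takeover(
          my_last_applied: OpTime,
          primary_last_applied: OpTime,
          current_term: int,
          election_in_progress: bool,
      ) -> bool:
          """Decide whether a heartbeat should trigger a catchup takeover (sketch)."""
          if election_in_progress:
              return False
          # Proposed extra check: if we have already applied an entry from the
          # current term, the primary must have finished catching up, so a
          # stale view of its optime should not trigger a takeover.
          if my_last_applied.term >= current_term:
              return False
          # Existing condition (simplified): the primary appears to be behind us.
          return primary_last_applied < my_last_applied
      ```

      With this sketch, a chained secondary whose last applied optime is already in the current term never schedules a takeover, even if its recorded optime for the primary is stale. A secondary that is still in an older term and ahead of the primary continues to schedule one.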

            People

            Assignee:
            Vesselina Ratcheva (vesselina.ratcheva)
            Reporter:
            Matthew Russotto (matthew.russotto)