  1. Core Server
  2. SERVER-38366

Replica set nodes updating the term without verifying the config version can lead to an unnecessary stepdown.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Won't Fix
    • Affects Version/s: 4.1.5
    • Fix Version/s: None
    • Component/s: Replication
    • Labels:
      None
    • Operating System:
      ALL

      Description

      Currently, replica set nodes can learn about a higher term via heartbeats, the oplog fetcher, and commands (such as find and getMore). When the term is learned via the oplog fetcher, it calls ReplicationCoordinatorImpl::_processReplSetMetadata_inlock, which updates the term only if the sync source's config version is the same as the local node's. That config version check is missing in the heartbeat, find, and getMore paths before the term is updated.
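
The difference between the two paths can be illustrated with a minimal Python sketch. It is not the server's C++ code: the `Node` and `ReplMetadata` classes, their method names, and the config version numbers are hypothetical and exist only to contrast a term update that is gated on the config version with one that is not.

```python
from dataclasses import dataclass


@dataclass
class ReplMetadata:
    """Term and config version reported by a remote node (hypothetical shape)."""
    term: int
    config_version: int


@dataclass
class Node:
    term: int
    config_version: int

    def process_repl_set_metadata(self, md: ReplMetadata) -> bool:
        """Oplog-fetcher-style path: adopt the term only when configs match."""
        if md.config_version != self.config_version:
            return False  # remote node is on a different config; ignore its term
        return self._update_term(md.term)

    def process_heartbeat(self, md: ReplMetadata) -> bool:
        """Heartbeat-style path as described in this ticket: no config check."""
        return self._update_term(md.term)

    def _update_term(self, new_term: int) -> bool:
        """Adopt a strictly higher term; return True if the term changed."""
        if new_term > self.term:
            self.term = new_term
            return True
        return False


# A remote node reporting term 1 from an older config (version numbers assumed).
stale = ReplMetadata(term=1, config_version=1)

guarded = Node(term=0, config_version=3)
guarded.process_repl_set_metadata(stale)  # config mismatch: term stays 0

unguarded = Node(term=0, config_version=3)
unguarded.process_heartbeat(stale)        # no check: term becomes 1
```

In this sketch the guarded path leaves the local term untouched when the configs differ, while the unguarded path adopts the remote term, which is the behavior this ticket reports for heartbeat, find, and getMore.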

      Note also that ReplicationCoordinatorImpl::_handleHeartbeatResponse updates the term in two places.

      Note: This bug was observed for this particular upgrade/downgrade sequence (pv1->pv0->pv1), where it led to an unnecessary stepdown.

      1) Start a replica set in pv1.

      2) Insert some documents in pv1 (term = 1).

      3) Downgrade to pv0 while the secondaries are still replicating the documents from the previous pv1 (term = 1).

      4) Upgrade to pv1 before the secondaries downgrade to pv0.

      5) The current primary, which is in term 0, receives heartbeats from the secondaries, which think they are still in term 1 (from step 1).

      6) As a result, the current primary updates its term to 1, steps down, and starts a new election for term 2.
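
Steps 5 and 6 can be replayed with a small Python sketch. The function `replay_heartbeat` and the config version numbers (1 for the original pv1 config, 3 for the config after the downgrade and re-upgrade) are assumptions made for illustration; only the terms 0, 1, and 2 come from the steps above.

```python
def replay_heartbeat(primary_term, primary_cfg, hb_term, hb_cfg, check_config):
    """Replay one heartbeat arriving at the primary.

    Returns (new_term, stepped_down, election_term); election_term is None
    when no election is started.
    """
    if check_config and hb_cfg != primary_cfg:
        # Heartbeat comes from a node on a different config: ignore its term.
        return primary_term, False, None
    if hb_term > primary_term:
        # The primary adopts the higher term, steps down, and the next
        # election runs in the following term.
        return hb_term, True, hb_term + 1
    return primary_term, False, None


# Primary: term 0, assumed config version 3. Secondary: term 1, assumed version 1.
print(replay_heartbeat(0, 3, 1, 1, check_config=False))  # (1, True, 2)
print(replay_heartbeat(0, 3, 1, 1, check_config=True))   # (0, False, None)
```

Without the check, the sketch reproduces the stepdown and the election for term 2; with a config version check in place, the stale heartbeat is ignored and the primary stays in term 0.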

              People

              Assignee:
              backlog-server-repl Backlog - Replication Team
              Reporter:
              suganthi.mani Suganthi Mani
              Participants:
              Votes:
              0
              Watchers:
              5
