  1. Core Server
  2. SERVER-34728

Heartbeats are used to advance replication commit point

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Works as Designed
    • Affects Version/s: 3.4.14, 3.6.4
    • Fix Version/s: None
    • Component/s: Replication
    • Labels:
      None
    • Operating System:
      ALL
    • Backport Requested:
      v3.6, v3.4
    • Sprint:
      Repl 2018-05-21

      Description

      The TopologyCoordinator uses _memberData.getLastAppliedOpTime() to advance the commit point on primaries. _memberData.getLastAppliedOpTime() returns _lastAppliedOpTime, which is set in advanceLastAppliedOpTime(). That function is called from setUpValues(), which in turn is called when a heartbeat response is processed.

      This is a problem. Consider 3 nodes A, B, and C. A starts as the primary and commits OpTime(Timestamp(1,1), 1) to all nodes. A writes OpTime(Timestamp(2,1), 1), which replicates to B, but A never receives the acknowledgement and so never commits it. A also writes OpTime(Timestamp(3,1), 1). B then runs for election in term 2, and C votes for it since B is ahead. A steps down and then runs for election again in term 3; C votes for it and it wins. B takes a write at OpTime(Timestamp(4,1), 2) and A takes a write at OpTime(Timestamp(5,1), 3). A then gets a heartbeat from B, hears that B is at OpTime(Timestamp(4,1), 2), and commits all operations less than that optime, including OpTime(Timestamp(3,1), 1), which is only on A. If B then runs for election again in term 4 and C votes for it, A can begin syncing from B and roll back its majority-committed write.
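
      The scenario above can be sketched as a small model. This is not the server's code: the OpTime class, the naive_commit_point helper, the term-first ordering, and the assumption that C is still at OpTime(Timestamp(1,1), 1) are all illustrative choices made here. It only shows that counting B's heartbeat-reported optime toward the majority makes A treat an entry that exists only in its own oplog as committed.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class OpTime:
    # Field order makes comparisons term-first, then timestamp (an assumption).
    term: int
    ts: tuple  # (seconds, increment), standing in for Timestamp(s, i)


def naive_commit_point(last_applied):
    """Return the highest optime that a majority of members report having applied."""
    ordered = sorted(last_applied.values(), reverse=True)
    majority = len(ordered) // 2 + 1
    return ordered[majority - 1]


# Oplogs of A and B at the moment A (primary in term 3) hears B's heartbeat.
a_oplog = [OpTime(1, (1, 1)), OpTime(1, (2, 1)), OpTime(1, (3, 1)), OpTime(3, (5, 1))]
b_oplog = [OpTime(1, (1, 1)), OpTime(1, (2, 1)), OpTime(2, (4, 1))]

# What A believes each member has applied; C's value is assumed.
last_applied = {
    "A": a_oplog[-1],
    "B": b_oplog[-1],  # learned from the heartbeat
    "C": OpTime(1, (1, 1)),
}

commit_point = naive_commit_point(last_applied)
committed_on_a = [entry for entry in a_oplog if entry <= commit_point]
only_on_a = [entry for entry in committed_on_a if entry not in b_oplog]

print("commit point:", commit_point)
print("treated as committed but only on A:", only_on_a)
```

      In this model the commit point lands on B's optime from term 2, and A's term-1 entry at Timestamp(3,1) falls below it even though no other node holds that entry.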

      It's possible that something will prevent the above from happening exactly as stated, and it may be easier to reproduce in a 5 node set. That said, it is definitely a problem (and currently possible) for a node to commit operations on its branch of history based on oplog entries with higher optimes than the commit point, but lower terms than its current term (which would not cause a step down).
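
      The condition described above can be written as a predicate. This is a hypothetical sketch for discussion only: OpTime, is_stale_term_advance, and the term-first ordering are names and assumptions introduced here, not server code, and the predicate only identifies the risky case rather than proposing a fix.

```python
from typing import NamedTuple


class OpTime(NamedTuple):
    # Tuple ordering compares term first, then timestamp (an assumption).
    term: int
    ts: tuple  # (seconds, increment)


def is_stale_term_advance(reported, commit_point, current_term):
    """True when a reported optime is ahead of the commit point but was
    written in a term older than the primary's current term."""
    return reported > commit_point and reported.term < current_term


commit_point = OpTime(1, (1, 1))
current_term = 3

# B's heartbeat-reported optime from term 2: ahead of the commit point, older term.
print(is_stale_term_advance(OpTime(2, (4, 1)), commit_point, current_term))
# An optime written in the primary's own term does not match the condition.
print(is_stale_term_advance(OpTime(3, (5, 1)), commit_point, current_term))
```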

              People

              Assignee:
              tess.avitabile Tess Avitabile
              Reporter:
              judah.schvimer Judah Schvimer