  1. Core Server
  2. SERVER-52661

lastDurable is set after it is cleared in initial sync

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 4.9 Required
    • Component/s: None
    • None
    • Replication
    • ALL
    • 56

      When a new initial sync attempt starts, it seems possible for lastDurable to be set after it has been cleared. For example, through this sequence:

      1. After an initial sync attempt, the last applied oplog entry is not yet durable
      2. A new initial sync attempt starts and the oplogApplier in initial sync is shut down
      3. The journaling thread gets the lastApplied from the replication coordinator
      4. lastApplied and lastDurable are reset in the replication coordinator by initial sync
      5. lastDurable is set to lastApplied by the journaling thread
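
      The interleaving above can be replayed deterministically in a small model. This is only a sketch: ToyReplCoordinator, reset_optimes, and the integer optimes 10 and 7 are hypothetical stand-ins, not the server's actual classes or values, and the two threads are collapsed into a fixed order of statements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyReplCoordinator:
    """Hypothetical stand-in for the coordinator's optime state (not server code)."""
    last_applied: Optional[int] = None
    last_durable: Optional[int] = None

    def reset_optimes(self) -> None:
        # Models initial sync clearing both optimes (step 4).
        self.last_applied = None
        self.last_durable = None


def replay_sequence() -> ToyReplCoordinator:
    # Step 1: the last applied entry (10) is not yet durable (durable is 7).
    coord = ToyReplCoordinator(last_applied=10, last_durable=7)
    # Step 2: a new attempt starts and the applier shuts down
    # (no optime change in this model).
    # Step 3: the journaling thread reads lastApplied.
    snapshot = coord.last_applied
    # Step 4: initial sync resets both optimes.
    coord.reset_optimes()
    # Step 5: the journaling thread writes its stale snapshot as lastDurable.
    coord.last_durable = snapshot
    return coord


coord = replay_sequence()
print(coord)
```

      In this model lastDurable ends up non-null even though initial sync cleared it, because the read in step 3 and the write in step 5 are not atomic with respect to the reset in step 4.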

      This was discovered after SERVER-47898. With the change in that ticket, setting lastDurable also advances lastApplied, so lastApplied would be set again after the above sequence occurs. We would then reach an invariant, which would fail.
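
      A minimal sketch of why coupling the two optimes turns the stale write into a failure. The ticket does not say which invariant fires, so the check below (lastApplied must still be null after the reset) is purely an assumption for illustration. ToyOptimes, set_last_durable, the advance_applied flag, and the optime 10 are likewise hypothetical.

```python
from typing import Optional


class ToyOptimes:
    """Hypothetical optime holder (not server code)."""

    def __init__(self) -> None:
        self.last_applied: Optional[int] = None
        self.last_durable: Optional[int] = None

    def set_last_durable(self, ts: int, advance_applied: bool) -> None:
        self.last_durable = ts
        # Models the coupling described for SERVER-47898: setting
        # lastDurable also moves lastApplied forward when it is behind.
        if advance_applied and (self.last_applied is None or self.last_applied < ts):
            self.last_applied = ts


def invariant_holds(state: ToyOptimes) -> bool:
    # ASSUMED invariant, for illustration only: a freshly reset
    # initial sync attempt expects lastApplied to still be null.
    return state.last_applied is None


def run(advance_applied: bool) -> bool:
    state = ToyOptimes()                         # optimes just reset by initial sync
    state.set_last_durable(10, advance_applied)  # stale write from the journaling thread
    return invariant_holds(state)


without_coupling = run(advance_applied=False)
with_coupling = run(advance_applied=True)
```

      Under these assumptions the stale write goes unnoticed without the coupling (without_coupling is True) and trips the check with it (with_coupling is False).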

      As a small note, unexpectedly setting lastDurable does not seem to have caused noticeable issues until now. SERVER-47898 was reverted after a few days, and the BFs from the failing invariant went away once it was reverted.

      CC lingzhi.deng, matthew.russotto

            Assignee:
            backlog-server-repl [DO NOT USE] Backlog - Replication Team
            Reporter:
            xuerui.fa@mongodb.com Xuerui Fa
            Votes:
            0
            Watchers:
            5

              Created:
              Updated:
              Resolved: