  1. Core Server
  2. SERVER-35952

Secondaries can fall off the oplog even if necessary oplog entries exist in cluster

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Replication
    • Labels: None
    • Replication
    • ALL

      Consider a replica set chaining as follows:

      P -> S1 -> S2

      Consider 4 points in time:

      t1 -> t2 -> t3 -> t4

      1. Current time is t4.
      2. Primary P has oplog entries back to time t3.
      3. Secondary S1 has oplog entries further back to t1 due to replication lag.
      4. Secondary S2 requires oplog entries back to t2.
      5. Secondary S2 realizes its sync source S1 is lagged by more than 30 seconds and re-evaluates sync sources.
      6. S2 disqualifies S1 from consideration because it is more than 30 seconds behind.
      7. S2 considers P but disqualifies it because it only has oplog entries back to t3.
      8. S2 cannot find a valid sync source and falls off the oplog even though S1 has the oplog entries it requires to catch up.

      When evaluating sync sources, we only consider current state and lag, not which oplog entries each candidate has. This can lead to situations where a chained secondary falls off the oplog when switching sync sources (due to sync source re-evaluation) even though the necessary oplog entries exist in the replica set as a whole.
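The scenario above can be sketched as a toy model. This is a simplified illustration, not the server's actual sync source selection code: the `Member` class, the function names, and the numeric timestamps (t1=0, t2=100, t3=200, t4=300 seconds) are all hypothetical, and only the 30-second lag threshold comes from the description.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_LAG_SECS = 30  # lag threshold quoted in the description

# Hypothetical timestamps (seconds) standing in for t1..t4.
T1, T2, T3, T4 = 0, 100, 200, 300


@dataclass
class Member:
    name: str
    oldest: int  # timestamp of the oldest oplog entry still retained
    newest: int  # timestamp of the newest oplog entry applied


def is_eligible(candidate: Member, syncer_last_applied: int, primary_newest: int) -> bool:
    """Toy eligibility check mirroring steps 6 and 7 of the description."""
    # Step 6: reject candidates lagging the primary by more than the threshold.
    if primary_newest - candidate.newest > MAX_LAG_SECS:
        return False
    # Step 7: reject candidates whose oplog no longer reaches back far enough.
    if candidate.oldest > syncer_last_applied:
        return False
    return True


def choose_sync_source(candidates: List[Member], syncer_last_applied: int,
                       primary_newest: int) -> Optional[Member]:
    """Return the first eligible candidate, or None if there is none."""
    for c in candidates:
        if is_eligible(c, syncer_last_applied, primary_newest):
            return c
    return None


def members_holding(candidates: List[Member], ts: int) -> List[str]:
    """Names of members whose retained oplog window still covers ts."""
    return [c.name for c in candidates if c.oldest <= ts <= c.newest]


# Assumed state at t4: P is current but has truncated to t3;
# S1 retains entries back to t1 but is 50 seconds behind P.
P = Member("P", oldest=T3, newest=T4)
S1 = Member("S1", oldest=T1, newest=T4 - 50)
s2_last_applied = T2  # S2 needs entries from t2 onward

chosen = choose_sync_source([S1, P], s2_last_applied, primary_newest=P.newest)
holders = members_holding([S1, P], s2_last_applied)
print("chosen sync source:", chosen)
print("members that still hold the needed entries:", holders)
```

In this model no candidate passes both checks, so S2 is left without a sync source, while `members_holding` shows that S1 still retains the entries S2 needs.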

            Assignee:
            backlog-server-repl [DO NOT USE] Backlog - Replication Team
            Reporter:
            james.kovacs@mongodb.com James Kovacs
            Votes:
            0
            Watchers:
            9

              Created:
              Updated:
              Resolved: