  1. Core Server
  2. SERVER-28158

SnapshotThread should stop using LogicalClock to trigger snapshots

    • Type: Task
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.5.5
    • Affects Version/s: 3.5.3
    • Component/s: Replication
    • Labels:
      None
    • Backwards Compatibility: Fully Compatible
    • Sprint: Sharding 2017-03-06, Sharding 2017-03-27

      The LogicalClock can advance without any writes, so it should not be what triggers snapshots.

      Code in question:
      https://github.com/mongodb/mongo/blob/r3.5.3/src/mongo/db/repl/oplog.cpp#L1196

      Rationale:
      getLastSetTimestamp returns the last logical time set on the LogicalClock. Because the clock is implemented as a Lamport clock, it can advance even when no operations have been applied. Triggering snapshots from something other than the LogicalClock also avoids the following scenario:

      0. Secondary has replicated up to ts = T9.
      1. Primary creates a new op with ts = T10.
      2. Secondary gets a message from Primary and advances its clock to T10.
      3. Secondary snapshot thread notices the time changed to T10 and starts taking a snapshot.
      4. Secondary has only applied up to T9, so it gets a snapshot at T9.
      5. Secondary applies the oplog entry with T10.
      6. Secondary gets a read with readConcern majority and afterOpTime T10.
      7. Read times out since the snapshot is at T9 and the snapshot thread does not create a new snapshot because it thinks the time is already T10.
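
      The scenario above can be replayed with a small toy model. This is only a sketch and not the server's code: the Secondary class, its fields, and its method names are all hypothetical, and timestamps T9/T10 are modelled as the integers 9 and 10.

```python
from dataclasses import dataclass


@dataclass
class Secondary:
    """Toy model of a secondary. Every name here is hypothetical."""
    clock: int = 9            # logical (Lamport-style) clock
    applied: int = 9          # timestamp of the last applied oplog entry
    snapshot: int = 9         # timestamp of the newest snapshot
    last_seen_clock: int = 9  # clock value the snapshot thread last acted on

    def receive_message(self, ts: int) -> None:
        # Hearing about a later time advances the clock,
        # even though nothing has been applied locally.
        self.clock = max(self.clock, ts)

    def apply_op(self, ts: int) -> None:
        self.applied = ts
        self.clock = max(self.clock, ts)

    def snapshot_thread_tick(self) -> bool:
        # Clock-triggered policy: take a snapshot only when the clock moved.
        if self.clock == self.last_seen_clock:
            return False
        self.last_seen_clock = self.clock
        self.snapshot = self.applied  # a snapshot only contains applied data
        return True

    def can_serve_majority_read(self, after_op_time: int) -> bool:
        return self.snapshot >= after_op_time


def run_scenario() -> Secondary:
    s = Secondary()           # step 0: replicated up to T9
    s.receive_message(10)     # steps 1-2: clock jumps to T10
    s.snapshot_thread_tick()  # steps 3-4: snapshot taken, but only at T9
    s.apply_op(10)            # step 5: T10 is applied
    s.snapshot_thread_tick()  # clock unchanged, so no new snapshot
    return s
```

      In this model, after run_scenario() the secondary has applied T10 but its snapshot is still at T9, so a read with afterOpTime T10 (step 6) can never be satisfied (step 7).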

      Note that the newTimestampNotifier will also need to change so that it triggers when the actual applied/generated opTime changes (as opposed to when the LogicalClock changes).
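
      As a rough illustration of a notifier keyed on the applied opTime rather than on the clock, here is a minimal sketch. AppliedOpTimeNotifier and its methods are hypothetical names chosen for this example, not the server's newTimestampNotifier API, and opTimes are again modelled as integers.

```python
import threading


class AppliedOpTimeNotifier:
    """Wakes waiters only when the applied opTime moves forward (hypothetical)."""

    def __init__(self, initial: int) -> None:
        self._cond = threading.Condition()
        self._applied = initial

    def notify_applied(self, ts: int) -> None:
        # Called after an op has actually been applied or generated.
        with self._cond:
            if ts > self._applied:
                self._applied = ts
                self._cond.notify_all()

    def wait_for_change(self, last_seen: int, timeout: float = None) -> int:
        # Block until the applied opTime exceeds last_seen or the timeout
        # expires, then return the current applied opTime.
        with self._cond:
            self._cond.wait_for(lambda: self._applied > last_seen, timeout)
            return self._applied
```

      A snapshot thread built on this would call wait_for_change with the opTime of its last snapshot, so a clock advance with no applied write never wakes it, and every applied write eventually does.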

            Assignee:
            randolph@mongodb.com Randolph Tan
            Reporter:
            randolph@mongodb.com Randolph Tan
            Votes:
            0
            Watchers:
            3
