Core Server / SERVER-29693

"lastAppliedOpTime" value can sometimes be set to an optime ahead of uncommitted operations

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.5.13
    • Affects Version/s: None
    • Component/s: Replication, Storage
    • None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Storage 2017-08-21, Storage 2017-09-11
    • 16

      Currently, a PRIMARY node updates its lastAppliedOpTime value whenever any write operation commits: https://github.com/mongodb/mongo/blob/8ec35cb932d07eb034beee2b1938800675fb3c0c/src/mongo/db/repl/oplog.cpp#L373

      On storage engines with document-level locking, writes can commit out of optime order. As a result, using lastAppliedOpTime to decide when it is safe to allow reads does not provide the guarantee it is assumed to provide: that all operations prior to the lastAppliedOpTime are visible.
      In 3.4, the only operation that could run afoul of this is a read operation with readConcern: { level: "local", afterOpTime: xxxx } (or afterOpTime with level majority in a one-voting-node replica set), but such operations are currently disallowed. Thus, this bug is hidden from normal operation.
      In a future release, we expect that this guarantee will be critical for the correct operation of read operations with read concerns utilizing "afterClusterTime".
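
The hazard described above can be illustrated with a toy model. This is a minimal sketch, not the server's code and not necessarily how the fix was implemented: optimes are plain integers, and the names `OplogTracker`, `reserve`, `commit`, and `all_committed` are hypothetical. It contrasts a naive "last applied" value, advanced on every commit, with a conservative point below which no optime is still uncommitted.

```python
class OplogTracker:
    """Toy model (hypothetical): optimes are integers handed out in order."""

    def __init__(self):
        self.next_optime = 1
        self.uncommitted = set()  # optimes reserved but not yet committed
        self.last_applied = 0     # naive value: advanced on every commit

    def reserve(self):
        """Assign the next optime to a write that has not committed yet."""
        ts = self.next_optime
        self.next_optime += 1
        self.uncommitted.add(ts)
        return ts

    def commit(self, ts):
        """Commit a write; commits may arrive in any order."""
        self.uncommitted.remove(ts)
        self.last_applied = max(self.last_applied, ts)

    def all_committed(self):
        """Largest optime with no uncommitted optime at or below it."""
        if self.uncommitted:
            return min(self.uncommitted) - 1
        return self.next_optime - 1


tracker = OplogTracker()
a = tracker.reserve()  # first write, lower optime
b = tracker.reserve()  # second write, higher optime

tracker.commit(b)      # the later write commits first
naive_after_b = tracker.last_applied    # already points past the uncommitted write a
safe_after_b = tracker.all_committed()  # still below a, since a is not visible yet

tracker.commit(a)
safe_after_a = tracker.all_committed()  # now covers both writes
```

In this model, a reader that waits only for `last_applied` to reach its target optime could start reading while write `a` is still invisible, whereas waiting on `all_committed()` would not have that problem.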

            Assignee:
            milkie@mongodb.com Eric Milkie
            Reporter:
            milkie@mongodb.com Eric Milkie
