Chained secondary can read step-up no-op oplog entry beyond new primary’s oplog visibility timestamp

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Replication
    • ALL
    • Repl 2026-03-30

      In a replica set with chained replication, a secondary that is syncing from another secondary can replicate the step-up no-op oplog entry even though that entry lies beyond the sync source's oplog visibility timestamp. This can happen when the sync source steps up to become primary.

      When a secondary reads the oplog from another secondary, it uses the kLastApplied read source, because secondaries can apply oplog entries in parallel. Syncing from a primary uses the kNoTimestamp read source, with the understanding that oplog cursors on the primary are gated behind the oplog visibility timestamp mechanism. Secondaries should not be able to read beyond that timestamp, so using the kNoTimestamp read source should not affect consistency.

      However, there is no concurrency control between step-up and oplog cursor yield and restore. In the expected sequence, the oplog cursor that a secondary holds open on the sync source yields, and on restore the storage engine resources are reacquired using the kNoTimestamp read source. That switch does not happen until the new primary updates its write ability in the topology coordinator. Before that point, the replication coordinator writes the step-up no-op oplog entry. If an active oplog cursor fetches from the new primary in this window, it still reads with the kLastApplied read source, which does not adhere to oplog visibility rules.

      As a result, if the oplog visibility thread is delayed, an oplog cursor on the new primary may read the no-op entry before oplog visibility advances to the no-op entry's timestamp. In terms of impact, we believe that the oplog cursor can only read the no-op entry and no additional entries after it, which limits the possibility of a durability problem. This is only a theory, though: we were unable to confirm it in the reproducer, so whoever picks this up should verify that there are no durability concerns here.
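
      The interleaving described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not server code: `ToyNode`, `read_oplog`, and `simulate` are hypothetical names, timestamps are plain integers, and the kLastApplied read is modeled as bounded by a `last_applied` value that is assumed to already cover the no-op entry. It only shows why a cursor that has not yet switched read sources can return the no-op while oplog visibility lags.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class ReadSource(Enum):
    K_LAST_APPLIED = auto()  # used while the sync source is a secondary
    K_NO_TIMESTAMP = auto()  # used when the sync source is a primary


@dataclass
class ToyNode:
    """Hypothetical stand-in for the sync source; not the server's real types."""
    oplog: list = field(default_factory=list)  # (timestamp, description) pairs
    last_applied: int = 0
    oplog_visibility: int = 0

    def append(self, ts, desc):
        # Assumption: last_applied moves with every appended entry.
        self.oplog.append((ts, desc))
        self.last_applied = ts

    def advance_visibility(self):
        # Models the oplog visibility thread catching up.
        self.oplog_visibility = self.last_applied

    def read_oplog(self, after_ts, read_source):
        # Only the kNoTimestamp path is gated by oplog visibility in this model.
        if read_source is ReadSource.K_LAST_APPLIED:
            limit = self.last_applied
        else:
            limit = self.oplog_visibility
        return [e for e in self.oplog if after_ts < e[0] <= limit]


def simulate(cursor_restored_before_noop):
    """Return what a chained secondary's cursor fetches during step-up."""
    node = ToyNode()
    node.append(1, "insert")
    node.advance_visibility()

    # The cursor was opened while the sync source was still a secondary.
    cursor_source = ReadSource.K_LAST_APPLIED
    if cursor_restored_before_noop:
        # Hypothetical ordering in which the cursor switches read source first.
        cursor_source = ReadSource.K_NO_TIMESTAMP

    # The no-op is written; the visibility thread is delayed (still at 1).
    node.append(2, "step-up no-op")
    batch = node.read_oplog(after_ts=1, read_source=cursor_source)

    # Only now does the new primary become writable and visibility catch up.
    node.advance_visibility()
    return batch


if __name__ == "__main__":
    print(simulate(cursor_restored_before_noop=False))
    print(simulate(cursor_restored_before_noop=True))
```

      In this model, the first call returns the no-op entry even though visibility has not reached it, while the second call returns an empty batch. Whether the real server can expose anything beyond the no-op is exactly the open question noted above, and the model makes no claim about it.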

            Assignee:
            Unassigned
            Reporter:
            Ali Mir
            Votes:
            0
            Watchers:
            4

              Created:
              Updated: