  1. Core Server
  2. SERVER-28377

Do not check that remote last applied is ahead of local last fetched in OplogFetcher first batch during initial sync

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.4.4, 3.5.6
    • Component/s: Replication
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v3.4
    • Sprint:
      Repl 2017-04-17

      Description

      There is a race in which the sync source's most recent oplog entries can become visible to, and be read by, downstream nodes before the sync source updates its heartbeat map with its new last applied OpTime. As a result, downstream nodes can receive a stale lastAppliedOpTime in the metadata, which can cause the check in OplogFetcher::checkRemoteOplogStart to fail even though the sync source is not actually behind.

      This only causes the OplogFetcher to return early and choose a new sync source, so it should cause no harm beyond unnecessary sync source changes and some very quick initial sync restarts.

      We should remove the check that the remote last applied OpTime is greater than or equal to the local last fetched OpTime in OplogFetcher::checkRemoteOplogStart when "requireFresherSyncSource" is false. This will also require changing the comments that explain the boolean's meaning.
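
      The proposed behavior can be sketched roughly as follows. This is a simplified model, not the actual server code: the OpTime struct, the StartCheck enum, and the checkRemoteStart function are hypothetical stand-ins, and the strict "remote must be ahead" comparison in the requireFresherSyncSource == true branch is an assumption about the existing check.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical, simplified oplog position: ordered by term, then timestamp.
struct OpTime {
    std::int64_t term;
    std::uint64_t timestamp;
};

inline bool operator<(const OpTime& a, const OpTime& b) {
    return std::tie(a.term, a.timestamp) < std::tie(b.term, b.timestamp);
}

// Hypothetical result type for this sketch.
enum class StartCheck { kOk, kStaleSyncSource };

// Models the first-batch check described in this ticket.
StartCheck checkRemoteStart(const OpTime& localLastFetched,
                            const OpTime& remoteLastApplied,
                            bool requireFresherSyncSource) {
    if (!requireFresherSyncSource) {
        // Proposed change: do not compare at all, because the metadata
        // value may lag behind the oplog entries that are already visible.
        return StartCheck::kOk;
    }
    // Assumed existing behavior: the remote must be strictly ahead.
    if (!(localLastFetched < remoteLastApplied)) {
        return StartCheck::kStaleSyncSource;
    }
    return StartCheck::kOk;
}
```

      With this shape, a stale metadata value (remote last applied behind local last fetched) no longer rejects the sync source when requireFresherSyncSource is false, while the stricter check is unchanged when it is true.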

      An alternative is to use the max of the metadata lastOpApplied and the last OpTime in the batch as the remote last applied OpTime in OplogFetcher::checkRemoteOplogStart.
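
      A minimal sketch of this alternative, under the same caveat that it is a model and not the server implementation: OpTime and effectiveRemoteLastApplied are hypothetical names, and ordering by (term, timestamp) is a simplifying assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>

// Hypothetical, simplified oplog position: ordered by term, then timestamp.
struct OpTime {
    std::int64_t term;
    std::uint64_t timestamp;
};

inline bool operator<(const OpTime& a, const OpTime& b) {
    return std::tie(a.term, a.timestamp) < std::tie(b.term, b.timestamp);
}

// Treat the remote as having applied at least everything it has already
// returned: take the later of the metadata value and the last entry in
// the first batch.
OpTime effectiveRemoteLastApplied(const OpTime& metadataLastOpApplied,
                                  const OpTime& lastOpTimeInBatch) {
    return std::max(metadataLastOpApplied, lastOpTimeInBatch);
}
```

      If the metadata is stale but the batch already contains newer entries, the batch's last OpTime wins, so the existing comparison would no longer fail because of the race.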
