Core Server / SERVER-49159

Return NotPrimaryOrSecondary if currentTime is uninitialized in waitForReadConcernImpl

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.8.0
    • Component/s: Replication
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.4
    • Sprint:
      Repl 2020-07-27, Repl 2020-09-21, Repl 2020-10-05
    • Linked BF Score:
      9

      Description

      Initial sync is currently resumable after certain transient failures, including brief outages caused by a sync source restart. When a sync source goes down and comes back up, we generally expect the initial-syncing node to observe this as a network error. However, there is a specific window in which requests against a freshly restarted sync source can fail in a way that is fatal to the initial sync process. The oplog fetcher may try to read from the remote oplog while that node is still in STARTUP state, with appliedThrough and clusterTime both at (0,0). In that scenario the read fails because the sync source cannot satisfy the afterClusterTime (0,1) read; the error is raised at the request-validation level. Since this is not a network error, the oplog fetcher does not use its new restart strategy and instead falls back to the old behavior, which is to retry n (default 10) times. If those retries are exhausted, initial sync fails.
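
      The failure mode described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the server's C++ code: `NetworkError`, `ValidationError`, `run_fetcher`, and `scripted_source` are hypothetical names, and the restart strategy is modeled simply as "retry without spending the legacy budget" because the ticket does not describe its details.

```python
class NetworkError(Exception):
    """Hypothetical stand-in for a connection-level failure."""


class ValidationError(Exception):
    """Hypothetical stand-in for a request-validation failure,
    such as an afterClusterTime read the sync source cannot satisfy."""


def run_fetcher(attempt_read, max_legacy_retries=10):
    """Return "ok" once a read succeeds, or "failed" if initial sync would fail.

    attempt_read is a hypothetical callable that raises on error.
    max_legacy_retries mirrors the "n (default 10)" from the description.
    """
    legacy_retries_used = 0
    while True:
        try:
            attempt_read()
            return "ok"
        except NetworkError:
            # Assumed model of the restart strategy: try again without
            # consuming the legacy retry budget.
            continue
        except ValidationError:
            # Not a network error, so the old bounded-retry path applies.
            legacy_retries_used += 1
            if legacy_retries_used > max_legacy_retries:
                return "failed"


def scripted_source(outcomes):
    """Build an attempt_read callable from a list of exceptions or None.

    None means the read succeeds; an exception instance is raised as-is.
    """
    remaining = iter(outcomes)

    def attempt_read():
        outcome = next(remaining)
        if outcome is not None:
            raise outcome

    return attempt_read
```

      In this model, a long run of network errors followed by a successful read still ends in "ok", while a sync source that keeps returning a validation error for the initial attempt and all ten retries ends in "failed". That asymmetry is the problem the ticket describes: the restarted node is reachable, but the error class it reports routes the fetcher onto the bounded path.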

              People

              Assignee:
              xuerui.fa Xuerui Fa
              Reporter:
              vesselina.ratcheva Vesselina Ratcheva