  1. Core Server
  2. SERVER-4766

Make initial sync restartable per collection

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Won't Do
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Replication
    • Labels:

      Description

      Allow a node to be restarted during initial sync so that, after the restart, it only needs to clone the collections that have not yet been fully cloned.

      This will require ensuring that the sync source's oplog still reaches back to the start of cloning (that is, to before the restart), and that no rollback has occurred that would invalidate the data already cloned.
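
      The two preconditions above can be sketched as a resume check. This is only an illustration under stated assumptions, not the server's implementation: `SyncCheckpoint`, `can_resume`, `collections_to_clone`, the integer timestamps, and the rollback counter are all hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncCheckpoint:
    """Hypothetical progress record persisted by the syncing node."""
    begin_ts: int         # oplog timestamp at which cloning started
    rollback_id: int      # sync source's rollback counter when cloning started
    completed: frozenset  # namespaces that were fully cloned before the restart


def can_resume(cp, source_oldest_oplog_ts, source_rollback_id):
    """Return True only if both preconditions from the description hold."""
    # The source's oplog must still contain the entry at which cloning began.
    oplog_covers_start = source_oldest_oplog_ts <= cp.begin_ts
    # The source must not have rolled back since cloning began.
    no_rollback = source_rollback_id == cp.rollback_id
    return oplog_covers_start and no_rollback


def collections_to_clone(cp, all_namespaces, source_oldest_oplog_ts, source_rollback_id):
    """List the namespaces that still need cloning after a restart."""
    if not can_resume(cp, source_oldest_oplog_ts, source_rollback_id):
        # Preconditions failed: fall back to a full initial sync.
        return list(all_namespaces)
    return [ns for ns in all_namespaces if ns not in cp.completed]
```

      In this sketch, a failed check simply degrades to today's behavior of cloning everything, so the resume path is an optimization rather than a correctness requirement.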

      Old Description
      Currently, if the clone phase of initial sync fails because of a server crash or shutdown, we restart from scratch. It ought to be possible to record progress as we go so that we can pick up from wherever we left off. (For example, if the clone used the _id index and occasionally persisted the last written _id for each collection it visited, then it could resume from the last _id seen. Reasoning about the minvalid oplog entry would remain unchanged, I believe.)
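
      The _id checkpoint idea in the parenthetical can be sketched with in-memory stand-ins. Everything here is hypothetical and for illustration only: `clone_collection`, the `checkpoint` dict, the `dest` dict standing in for the local collection, and the `fail_after` parameter used to simulate a crash. It assumes _id values are mutually comparable.

```python
def clone_collection(source_docs, dest, checkpoint, ns, batch_size=2, fail_after=None):
    """Copy documents in _id order, resuming after the last persisted _id.

    source_docs: list of dicts, each with an '_id' key (stand-in for the sync source)
    dest:        dict keyed by _id (stand-in for the local collection)
    checkpoint:  dict mapping namespace -> last persisted _id
    fail_after:  if set, raise after this many documents to simulate a crash
    """
    last_id = checkpoint.get(ns)
    copied = 0
    for doc in sorted(source_docs, key=lambda d: d["_id"]):
        if last_id is not None and doc["_id"] <= last_id:
            continue  # already covered by the persisted checkpoint
        if fail_after is not None and copied == fail_after:
            raise RuntimeError("simulated crash")
        dest[doc["_id"]] = doc  # keyed write, so repeating it is harmless
        copied += 1
        if copied % batch_size == 0:
            checkpoint[ns] = doc["_id"]  # persist progress occasionally
    if source_docs:
        # Record the final position once the whole collection is copied.
        checkpoint[ns] = max(d["_id"] for d in source_docs)
    return copied
```

      Because progress is persisted only occasionally, a restart may copy again a few documents written after the last checkpoint, so the writes need to be idempotent, as the keyed write is in this sketch.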

      Operationally, this would make getting out of certain stuck cases less irritating for users, e.g., if a fresh node never goes from RECOVERING to SECONDARY for some reason, they could at least know that if they restart the process, we'll try our best to minimize subsequent recovery time, rather than starting over.

              People

              Assignee:
              backlog-server-repl Backlog - Replication Team
              Reporter:
              richard.kreuter Richard Kreuter
              Participants:
              Votes:
              7
              Watchers:
              9

                Dates

                Created:
                Updated:
                Resolved: