  1. Core Server
  2. SERVER-4766

Make initial sync restartable per collection

    • Type: Improvement
    • Resolution: Won't Do
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Replication
    • Labels:
    • Replication

      Make it possible to restart a node during initial sync so that it only needs to clone the collections that have not yet been completed.

      This will require ensuring that the oplog exists from the start of the cloning (from before the restart), and that no rollback has occurred which would invalidate existing cloned data.
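
One way to read the two preconditions above is as a check run on restart, before any cloned data is reused. The sketch below is illustrative only: `SyncState`, `can_resume`, the integer timestamps, and the rollback counter are hypothetical names and simplifications, not the server's actual data structures, and it assumes "the oplog exists" means the available oplog still reaches back to the point where cloning began.

```python
from dataclasses import dataclass


@dataclass
class SyncState:
    """Hypothetical progress record persisted by the syncing node."""
    clone_start_ts: int   # oplog timestamp at which cloning began
    rollback_id: int      # rollback counter observed when cloning began


def can_resume(state: SyncState, oldest_oplog_ts: int, current_rollback_id: int) -> bool:
    """Return True only if previously cloned data can safely be reused."""
    # The oplog must still cover everything since the clone started,
    # otherwise operations needed to catch up have been lost.
    oplog_covers_clone = oldest_oplog_ts <= state.clone_start_ts
    # Any rollback since the clone started may have invalidated cloned data.
    no_rollback = current_rollback_id == state.rollback_id
    return oplog_covers_clone and no_rollback
```

If either condition fails, the node would fall back to today's behavior and restart the initial sync from scratch.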

      Old Description
      Currently in initial sync, if the clone fails due to a server crash or shutdown, we restart from scratch. It seems like it ought to be possible to record progress as we go so that we can pick up from wherever we left off. (For example, if the clone used the _id index and occasionally persisted the last written _id for each collection it visited, then it could pick up from the last _id seen. Reasoning about the minvalid oplog entry would remain unchanged, I believe.)
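
The checkpointing idea in the parenthetical can be sketched with plain in-memory dictionaries standing in for collections. Everything here is hypothetical (`clone_collection`, the `checkpoints` map, the `fail_after` crash simulation); it assumes documents are copied in `_id` order and that writes are idempotent upserts keyed by `_id`, so re-copying the few documents written after the last checkpoint is harmless.

```python
from typing import Dict, Optional


def clone_collection(
    source: Dict[int, str],
    target: Dict[int, str],
    checkpoints: Dict[str, int],
    name: str,
    checkpoint_every: int = 2,
    fail_after: Optional[int] = None,
) -> int:
    """Copy documents in _id order, resuming after the last persisted _id.

    Returns the number of documents written during this call.
    """
    last_id = checkpoints.get(name)
    # Only visit documents beyond the last persisted checkpoint.
    pending = sorted(i for i in source if last_id is None or i > last_id)
    copied = 0
    for doc_id in pending:
        if fail_after is not None and copied == fail_after:
            raise RuntimeError("simulated crash")
        target[doc_id] = source[doc_id]  # upsert keyed by _id, safe to repeat
        copied += 1
        if copied % checkpoint_every == 0:
            checkpoints[name] = doc_id  # occasionally persist progress
    if pending:
        checkpoints[name] = pending[-1]  # final checkpoint once the collection is done
    return copied
```

After a simulated crash, calling `clone_collection` again with the same `checkpoints` map re-copies only the documents past the last persisted `_id` instead of the whole collection. A real implementation would also need to store the checkpoint durably and account for documents changed on the source during the clone, which this sketch leaves to oplog application.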

      Operationally, this would make getting out of certain stuck cases less irritating for users, e.g., if a fresh node never goes from RECOVERING to SECONDARY for some reason, they could at least know that if they restart the process, we'll try our best to minimize subsequent recovery time, rather than starting over.

            backlog-server-repl [DO NOT USE] Backlog - Replication Team
            richard.kreuter Richard Kreuter (Inactive)