  1. Core Server
  2. SERVER-64939

Minimize shard split duration by sending a step up command to secondary

Details

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version: 6.1.0-rc0
    • None
    • None
    • None
    • Fully Compatible
    • Sprint: Server Serverless 2022-05-02, Server Serverless 2022-05-16, Server Serverless 2022-05-30

    Description

      To minimize the duration of a shard split, we want to trigger an election manually rather than wait for the election timeout. The shard split service will send a `replSetStepUp` command to one of the nodes so that a primary is elected as soon as possible. If the step up fails, the service will select another node and send the command again.
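
      The retry behaviour described above can be sketched as follows. This is an illustrative Python sketch only, not the server's implementation: `step_up_first_electable` and the `send_step_up` callable are hypothetical names, and the callable is assumed to return True when the contacted node wins its election. In practice it would wrap a `replSetStepUp` command sent to that node.

```python
import random
from typing import Callable, Optional, Sequence


def step_up_first_electable(
    recipients: Sequence[str],
    send_step_up: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Ask nodes to step up, in random order, until one succeeds.

    `send_step_up` is a hypothetical hook: it should send `replSetStepUp`
    to the given host and return True if the step up succeeded.
    Returns the host that stepped up, or None if every attempt failed.
    """
    rng = rng or random.Random()
    # Visit every candidate at most once, in a random order.
    candidates = rng.sample(list(recipients), len(recipients))
    for host in candidates:
        if send_step_up(host):
            return host
        # The step up failed (for example, the node's replication state
        # was behind the others), so fall through and try another node.
    return None
```

      Injecting the sender as a callable keeps the selection and retry logic testable without a running replica set.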

      One optimization to this method would be to disable replication on the recipient nodes at the same time, so that they all have the same oplog and the `replSetStepUp` succeeds. This was deemed too complicated for now and the idea was set aside (see Previous context for more info).

      Previous context:

      After SERVER-64935 we will send a `replSetStepUp` command to a random recipient node in order to run an immediate election. This node may lose the election if its replication state is older than that of the other nodes, in which case we would need to retry the election against another node. To ensure that any selected recipient node is electable, we should pause replication on the recipient nodes at the same time, which guarantees they have an equivalent replication state.

      We can use the split state document as the tombstone: if the state is kBlocking and the current node is tagged with recipientTagName, then pause replication on this node. Once a new primary is elected, re-enable replication. Note that we may still need to clear the sync state to ensure that, when replication is restarted, the node does not start syncing from one of the donor nodes.

      Some additional benefits to this approach:

      • Recipient nodes will not need to perform replication rollback after the election
      • We will prevent unnecessary replication traffic for data that will be deleted during orphan cleanup after the split operation completes
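
      The pause condition from the previous context can be expressed as a small predicate. This is a hedged Python sketch under stated assumptions: `SplitStateDoc`, its field names, and `should_pause_replication` are hypothetical, node tags are modelled as a plain name-to-value mapping, and a node counts as "tagged with recipientTagName" when that tag name is present.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SplitStateDoc:
    """Hypothetical, simplified view of the split state document."""
    state: str               # e.g. "kBlocking"
    recipient_tag_name: str  # tag name that marks recipient nodes


def should_pause_replication(
    doc: Optional[SplitStateDoc],
    node_tags: Mapping[str, str],
    new_primary_elected: bool,
) -> bool:
    """Return True while this node should keep replication paused."""
    if doc is None or new_primary_elected:
        # No split in progress, or a new primary has already been
        # elected: replication should be (re-)enabled.
        return False
    is_recipient = doc.recipient_tag_name in node_tags
    return doc.state == "kBlocking" and is_recipient
```

      The sketch deliberately leaves out clearing the sync state, which the description notes may still be needed before replication restarts.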

          People

            didier.nadeau@mongodb.com Didier Nadeau
            matt.broadstone@mongodb.com Matt Broadstone