[SERVER-64939] Minimize shard split duration by sending a step up command to secondary
Created: 25/Mar/22  Updated: 29/Oct/23  Resolved: 16/May/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.1.0-rc0

Type: Task
Priority: Major - P3
Reporter: Matt Broadstone
Assignee: Didier Nadeau
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-64935 Send replSetStepUp to a random recipi... Closed
Backwards Compatibility: Fully Compatible
Sprint: Server Serverless 2022-05-02, Server Serverless 2022-05-16, Server Serverless 2022-05-30
Participants:

 Description   

In order to minimize the duration of a shard split, we want to manually trigger an election rather than wait for the election timeout. The shard split service will send a `replSetStepUp` command to one of the recipient nodes so that a primary is elected as soon as possible. If the step up fails, the service will select another node and send the command again.
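
The retry behavior described above can be sketched roughly as follows. This is only an illustration in Python, not the server implementation: the function name `step_up_any_recipient`, the injected `send_step_up` callable, and the choice to try each node at most once in random order are all assumptions made for the example.

```python
import random
from typing import Callable, Optional, Sequence


def step_up_any_recipient(
    recipients: Sequence[str],
    send_step_up: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Try recipient nodes in random order until one step up succeeds.

    `send_step_up` is a hypothetical callable that sends replSetStepUp to
    the given host and returns True on success. Returns the host that
    stepped up, or None if every attempt failed.
    """
    rng = rng or random.Random()
    candidates = list(recipients)
    rng.shuffle(candidates)  # assumption: each node is tried at most once
    for host in candidates:
        try:
            if send_step_up(host):
                return host
        except Exception:
            # Assumption: any transport or command error counts as a
            # failed attempt, so we move on to the next node.
            continue
    return None
```

A caller would pass the recipient hosts and a function that actually issues the command. Returning `None` leaves the fallback to the caller; waiting for the normal election timeout is one option.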

One optimization to this method would be to disable replication on the recipient nodes at the same time, to ensure that they all have the same oplog and that the `replSetStepUp` succeeds. This was deemed too complicated for now and the idea was put aside (see Previous context for more info).

Previous context:

After SERVER-64935 we will send a `replSetStepUp` command to a random recipient node in order to run an immediate election. It's possible that this node will lose the election if its replication state is older than that of the other nodes, meaning we might need to retry the election against another node. In order to ensure that any selected recipient node is electable, we should pause replication on the recipient nodes at the same time, which guarantees they have an equivalent replication state.

We can use the split state document as the tombstone for this: if the state is kBlocking and the current node is tagged with recipientTagName, then pause replication on this node. Once a new primary is elected, re-enable replication. Note that we may still need to clear the sync state to ensure that, when replication is restarted, the node does not start syncing from one of the donor nodes.
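
As a rough illustration of the pause condition proposed here (an idea that was set aside, so this is not what the server does), the check could look like the Python sketch below. The function name, the string form of the state, the representation of member tags as a mapping, and the `primary_elected` flag are all assumptions made for the example.

```python
from typing import Mapping

# Assumption: the split state is represented by its name as a string.
K_BLOCKING = "kBlocking"


def should_pause_replication(
    split_state: str,
    node_tags: Mapping[str, str],
    recipient_tag_name: str,
    primary_elected: bool,
) -> bool:
    """Return True while this node should keep replication paused.

    Replication is paused only when the split is in the blocking state
    and this node carries the recipient tag. It is re-enabled as soon
    as a new primary has been elected.
    """
    is_recipient = recipient_tag_name in node_tags
    return split_state == K_BLOCKING and is_recipient and not primary_elected
```

Clearing the sync state before replication restarts, as noted above, is deliberately left out of this sketch.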

Some additional benefits to this approach:

  • Recipient nodes will not need to perform replication rollback after the election
  • We will prevent unnecessary replication traffic for data that will be deleted during orphan cleanup after the split operation completes


 Comments   
Comment by Githook User [ 13/May/22 ]

Author: Didier Nadeau <didier.nadeau@mongodb.com> (nadeaudi)

Message: SERVER-64939 Send replSetStepUp command to recipient node
Branch: master
https://github.com/mongodb/mongo/commit/0592f87648a1fc2fdf4a51c96eb01d84330aad5b
