[SERVER-68263] Do not remove blockers for aborted shard split when deleting the state document Created: 25/Jul/22  Updated: 29/Oct/23  Resolved: 28/Jul/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.1.0-rc0

Type: Task Priority: Major - P3
Reporter: Didier Nadeau Assignee: Didier Nadeau
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Sprint: Server Serverless 2022-08-08
Participants:

 Description   

The tenant access blockers are removed at three locations for shard split:

This opens up a race condition, because split tries to remove the blocker for a tenant twice for an aborted split (once when setting expireAt and once when deleting the state document). The following scenario is therefore possible:

 

  1. A split is started for tenantA with id 1
  2. Blockers are installed for tenantA for split 1
  3. Split 1 aborts due to an error
  4. forgetShardSplit is called for split 1. It sets expireAt and removes the blocker for tenantA
  5. A new split is started with id 2
  6. Blockers are installed for tenantA for split 2
  7. The state document for split 1 is deleted. The blocker for tenantA is removed in onDelete, even though this blocker is now owned by split 2
  8. Split 2 triggers an invariant because it expects to have a blocker for tenantA
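
The scenario above can be reproduced with a small toy model. This is only an illustrative sketch in Python, not the server code: BlockerRegistry, forget_split, on_delete, run_scenario and the skip_aborted flag are hypothetical names, and the registry is assumed to be keyed by tenant only, so a removal cannot tell which split owns the blocker. With skip_aborted=True, on_delete leaves the blocker alone for an aborted split, which is the behavior named in the ticket title.

```python
from dataclasses import dataclass, field


@dataclass
class BlockerRegistry:
    """Toy stand-in for the blocker registry: maps tenant -> owning split id."""
    owners: dict = field(default_factory=dict)

    def install(self, tenant, split_id):
        self.owners[tenant] = split_id

    def remove(self, tenant):
        # Removal is keyed by tenant only (an assumption of this sketch),
        # so it cannot tell which split currently owns the blocker.
        self.owners.pop(tenant, None)


def forget_split(registry, tenant):
    """Step 4: forgetShardSplit sets expireAt and removes the blocker."""
    registry.remove(tenant)


def on_delete(registry, tenant, state, skip_aborted):
    """Step 7: the state document is deleted.

    skip_aborted=False models the behavior described in the ticket;
    skip_aborted=True models the fix: do nothing for an aborted split.
    """
    if skip_aborted and state == "aborted":
        return
    registry.remove(tenant)


def run_scenario(skip_aborted):
    registry = BlockerRegistry()
    registry.install("tenantA", 1)      # steps 1-2: split 1 installs its blocker
    # step 3: split 1 aborts (no registry change)
    forget_split(registry, "tenantA")   # step 4: blocker removed the first time
    registry.install("tenantA", 2)      # steps 5-6: split 2 installs its blocker
    on_delete(registry, "tenantA", "aborted", skip_aborted)  # step 7
    return registry.owners.get("tenantA")  # step 8: split 2 expects to see 2


if __name__ == "__main__":
    print("without fix:", run_scenario(skip_aborted=False))
    print("with fix:", run_scenario(skip_aborted=True))
```

Without the fix, the second removal deletes the blocker that split 2 installed, so the lookup in step 8 finds nothing. With the fix, split 2 still owns the blocker for tenantA.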


 Comments   
Comment by Githook User [ 27/Jul/22 ]

Author: Didier Nadeau <didier.nadeau@mongodb.com> (username: nadeaudi)

Message: SERVER-68263 Do not remove blockers when deleting an aborted shard split state document
Branch: master
https://github.com/mongodb/mongo/commit/0760d88c733011a7835b765368283e8dc2d5c144
