[SERVER-69890] Concurrent movePrimary and removeShard can move database to a no-longer existent shard Created: 22/Sep/22  Updated: 29/Oct/23  Resolved: 13/Dec/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.0.4, 6.2.0-rc5, 6.3.0-rc0

Type: Bug Priority: Major - P3
Reporter: Jordi Serra Torrens Assignee: Antonio Fuschetto
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File 0001-Repro-SERVER-69890.patch    
Issue Links:
Backports
Related
related to SERVER-68541 Concurrent removeShard and movePrimar... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v6.2, v6.0
Sprint: Sharding EMEA 2022-11-14, Sharding EMEA 2022-11-28, Sharding EMEA 2022-12-12, Sharding EMEA 2022-12-26
Participants:

 Description   

Consider the following interleaving:

  1. MovePrimary starts
  2. MovePrimary completes the cloning phase and is about to execute the commit phase.
  3. Concurrently, the user runs removeShard on the destination shard, and the removal completes before MovePrimary commits.
  4. MovePrimary then commits the metadata change on the configsvr, writing the no-longer-existent destination shardId as the db primary of the moved database.

As a result, the moved database becomes inaccessible and data may be lost if the removed shard is destroyed.
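
The interleaving above can be replayed with a small toy model. This is only a sketch: `ConfigCatalog`, its methods, and the shard and database names are hypothetical and are not MongoDB APIs. The real removeShard also involves draining, which is omitted here. The "checked" commit shows one possible guard (re-validating the destination shard at commit time); it is an assumption for illustration, not a description of the fix in the linked commits.

```python
class ConfigCatalog:
    """Toy stand-in for the configsvr's shard and database metadata (hypothetical)."""

    def __init__(self, shards, db_primaries):
        self.shards = set(shards)
        self.db_primaries = dict(db_primaries)

    def remove_shard(self, shard_id):
        # Simplified rule: a shard can be removed only if no database
        # currently names it as primary in the catalog.
        if shard_id in self.db_primaries.values():
            raise RuntimeError(f"{shard_id} is still a db primary")
        self.shards.discard(shard_id)

    def commit_move_primary_unchecked(self, db, to_shard):
        # Mirrors step 4: the new primary is written without re-validation.
        self.db_primaries[db] = to_shard

    def commit_move_primary_checked(self, db, to_shard):
        # Assumed guard: refuse to commit if the destination no longer exists.
        if to_shard not in self.shards:
            raise RuntimeError(f"{to_shard} no longer exists")
        self.db_primaries[db] = to_shard


def has_dangling_primary(catalog):
    """True if any database points at a shard that is not in the catalog."""
    return any(p not in catalog.shards for p in catalog.db_primaries.values())


def replay_interleaving(checked):
    catalog = ConfigCatalog({"shard0", "shard1"}, {"test": "shard0"})
    # Steps 1-2: cloning to shard1 has finished, but nothing is committed yet,
    # so the catalog still lists shard0 as the primary of "test".
    # Step 3: removeShard succeeds because shard1 owns no database on paper.
    catalog.remove_shard("shard1")
    # Step 4: movePrimary commits.
    try:
        if checked:
            catalog.commit_move_primary_checked("test", "shard1")
        else:
            catalog.commit_move_primary_unchecked("test", "shard1")
    except RuntimeError:
        pass  # The guarded commit aborts; the old primary is kept.
    return catalog
```

In this model the unchecked commit leaves the database pointing at a removed shard, while the guarded commit leaves it on the source shard.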



 Comments   
Comment by Githook User [ 09/Jan/23 ]

Author: Antonio Fuschetto <antonio.fuschetto@mongodb.com> (afuschetto)

Message: SERVER-69890 Concurrent movePrimary and removeShard can move database to a no-longer existent shard
Branch: v6.2
https://github.com/mongodb/mongo/commit/a31c4c6c4bb1d0d066ca5edbe2345425b61a498b

Comment by Githook User [ 09/Jan/23 ]

Author: Antonio Fuschetto <antonio.fuschetto@mongodb.com> (afuschetto)

Message: SERVER-69890 Concurrent movePrimary and removeShard can move database to a no-longer existent shard
Branch: v6.0
https://github.com/mongodb/mongo/commit/038e983c30733e1ba107acaf0c7edcd4f858ef22

Comment by Githook User [ 13/Dec/22 ]

Author: Antonio Fuschetto <antonio.fuschetto@mongodb.com> (afuschetto)

Message: SERVER-69890 Concurrent movePrimary and removeShard can move database to a no-longer existent shard
Branch: master
https://github.com/mongodb/mongo/commit/df652c2e4909ffe88bfbdc938666c86b82a9b2b0
