[SERVER-62854] ShardingCatalogManager::removeShard should prevent concurrent remove shard commits Created: 21/Jan/22  Updated: 23/Nov/23  Resolved: 18/Feb/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 5.0.5
Fix Version/s: 5.3.0-rc1, 5.0.19

Type: Bug Priority: Major - P3
Reporter: Allison Easton Assignee: Antonio Fuschetto
Resolution: Fixed Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0
Sprint: Sharding EMEA 2022-02-21
Participants:
Case:

 Description   

In ShardingCatalogManager::removeShard, the shard membership lock is released before the topology time is updated on the control shard. If two removeShard commands finish draining at the same time, choose the same control shard, and commit their new topology times out of order, the newer topology time is overwritten and is not stored in any of the config.shards entries. The shard registry refresh can then never fulfill any promises, because the topology time returned from _lookup is read from config.shards and will be smaller than the time already held in the store. The result is an infinite loop of shard registry lookups.
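
The interleaving described above can be replayed deterministically in a small model. This is only an illustrative sketch, not server code: TopologyTime, ConfigShards, commitTopologyTime, lookupTopologyTime and replayOutOfOrderCommits are hypothetical names, plain integers stand in for timestamps, and the lookup is assumed to report the largest topologyTime found in any entry.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a topology time; a plain integer is assumed here.
using TopologyTime = std::uint64_t;

// Hypothetical in-memory model of config.shards: shard id -> topologyTime.
using ConfigShards = std::map<std::string, TopologyTime>;

// Models the unguarded write: whichever commit lands last wins,
// even if it carries a smaller time than the one already stored.
void commitTopologyTime(ConfigShards& shards,
                        const std::string& controlShard,
                        TopologyTime t) {
    shards[controlShard] = t;
}

// Assumed lookup behaviour: report the largest topologyTime in any entry.
TopologyTime lookupTopologyTime(const ConfigShards& shards) {
    TopologyTime latest = 0;
    for (const auto& entry : shards) {
        latest = std::max(latest, entry.second);
    }
    return latest;
}

// Replays the problematic order: two removals pick the same control shard,
// the commit carrying time 20 lands first and the one carrying 10 lands second.
TopologyTime replayOutOfOrderCommits() {
    ConfigShards shards{{"shardA", 5}, {"shardB", 3}};
    commitTopologyTime(shards, "shardA", 20);
    commitTopologyTime(shards, "shardA", 10);  // overwrites the newer time
    return lookupTopologyTime(shards);
}
```

In this model the lookup ends up reporting 10 although 20 was already handed out, which corresponds to the state in which a waiter for time 20 can never be satisfied.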

We should hold the shard membership lock during these operations to eliminate the race condition, or at least ensure that the update to the control shard's topology time only ever increases the stored time and never decreases it.
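
A minimal sketch of the second option (a guarded, monotonic update) might look like the following. It is not the committed fix: ControlShardEntry and advanceTo are hypothetical names, a std::mutex stands in for the shard membership lock, and a plain integer stands in for the topology time.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a topology time; a plain integer is assumed here.
using TopologyTime = std::uint64_t;

// Hypothetical model of the control shard's entry in config.shards.
class ControlShardEntry {
public:
    explicit ControlShardEntry(TopologyTime initial) : _topologyTime(initial) {}

    // Stores the candidate only if it is strictly newer than the current value.
    // Returns true when the stored time advanced, false when the write was skipped.
    bool advanceTo(TopologyTime candidate) {
        std::lock_guard<std::mutex> lk(_mutex);  // compare and write under one lock
        if (candidate <= _topologyTime) {
            return false;
        }
        _topologyTime = candidate;
        return true;
    }

    TopologyTime get() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _topologyTime;
    }

private:
    mutable std::mutex _mutex;
    TopologyTime _topologyTime;
};
```

With this shape, a late commit that carries an older time becomes a no-op instead of overwriting the newer one. Holding the lock across the whole removal, as proposed first, would prevent the out-of-order commit from happening at all.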



 Comments   
Comment by Githook User [ 26/Jun/23 ]

Author:

{'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'}

Message: SERVER-62854 ShardingCatalogManager::removeShard should prevent concurrent remove shard commits
Branch: v5.0
https://github.com/mongodb/mongo/commit/402b89c904450b918e9da0964c19562918c851e2

Comment by Githook User [ 18/Feb/22 ]

Author:

{'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'}

Message: SERVER-62854 ShardingCatalogManager::removeShard should prevent concurrent remove shard commits
Branch: master
https://github.com/mongodb/mongo/commit/981826ed4ec9f522cb5973cccccfcde96bf6bbd5
