- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Sharding
- Fully Compatible
- ALL
- v6.1, v6.0, v5.0
- Sharding 2022-08-08, Sharding 2022-08-22, Sharding 2022-09-05, Sharding 2022-09-19, Sharding 2022-10-03
- 60
- 4
Currently, the _configsvrReshardCollection command relies on its own getExistingInstanceToJoin() function to determine whether an existing ReshardingCoordinator is already executing the same resharding operation. That function fetches and iterates over the existing ReshardingCoordinator instances instead of performing the check atomically by overriding checkIfConflictsWithOtherInstances(). As a result, if two identical _configsvrReshardCollection commands execute in quick succession (e.g. due to an election on the primary shard for the database), both can see no existing coordinator and each proceed to create one.
As seen in BF-23979, this will manifest as a Location5808201 error for the coordinator that loses the race to set allowMigrations to false.
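The race described above is a check-then-act problem. The following is a minimal sketch of that pattern only, not the server's C++ implementation: the `CoordinatorRegistry` class, its methods, and `run_two_commands` are hypothetical names made up for illustration, and a barrier is used to force the unlucky interleaving deterministically. It contrasts a separate lookup followed by a create with a single check-and-create performed under one lock.

```python
import threading


class CoordinatorRegistry:
    """Hypothetical stand-in for a registry of coordinator instances."""

    def __init__(self):
        self._lock = threading.Lock()
        self._instances = {}  # operation key -> coordinator id
        self.created = 0      # how many coordinators were ever created

    def find_existing(self, key):
        # Lookup only: the lock is released before the caller acts on the result.
        with self._lock:
            return self._instances.get(key)

    def create(self, key):
        # Creates unconditionally; a concurrent caller may have created one already.
        with self._lock:
            self.created += 1
            self._instances[key] = self.created
            return self.created

    def get_or_create(self, key):
        # Check and create under the same lock, so only one caller can create.
        with self._lock:
            if key not in self._instances:
                self.created += 1
                self._instances[key] = self.created
            return self._instances[key]


def run_two_commands(registry, atomic):
    """Run two identical 'commands' concurrently; return coordinators created."""
    barrier = threading.Barrier(2)
    key = ("db.coll", "newShardKey")  # illustrative operation identity

    def command():
        if atomic:
            barrier.wait()
            registry.get_or_create(key)
        else:
            existing = registry.find_existing(key)
            # Force the bad interleaving: both commands finish their lookup
            # before either one creates a coordinator.
            barrier.wait()
            if existing is None:
                registry.create(key)

    threads = [threading.Thread(target=command) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return registry.created


if __name__ == "__main__":
    print("non-atomic:", run_two_commands(CoordinatorRegistry(), atomic=False))
    print("atomic:", run_two_commands(CoordinatorRegistry(), atomic=True))
```

In the non-atomic variant both commands observe an empty registry and each creates a coordinator, which corresponds to the duplicate coordinators described in this ticket. In the atomic variant the second command finds the instance created by the first and can join it.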
- causes
- SERVER-70746 _configsvrReshardCollection Will Not Join Existing Operations After Shard Key is Updated
- Closed
- related to
- SERVER-78604 ReshardingCoordinatorService Index build deadlocks with OpObserver
- Backlog