|
The ReshardingCoordinator should retrieve the donor shards for the resharding operation only after allowMigrations:false has been set and/or locks have been acquired that prevent concurrent moveChunks from running or succeeding in the meantime.
In the current code, the ReshardingCoordinator can end up with an incorrect list of donors and recipients.
Consider the following scenario:
- suppose a failpoint pauses reshardCollectionCmd at this line here, after we get the donors/recipients from the chunkManager but before we create the ReshardingCoordinatorService instance, let alone set allowMigrations:false on the original collection
- a moveChunk comes in and succeeds
- we unpause the failpoint; reshardCollection begins and tries to run with donor shards that no longer own the data
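
The ordering problem in the scenario above can be illustrated with a small toy model. This is only a sketch, not the server code: `ToyCollection`, `reshard_racy`, `reshard_safe`, and the shard/chunk names are all hypothetical, and the `concurrent_op` callback merely stands in for whatever runs while the failpoint is paused.

```python
class ToyCollection:
    """Hypothetical stand-in for a sharded collection's routing state."""

    def __init__(self, chunk_owners):
        self.chunk_owners = dict(chunk_owners)  # chunk id -> owning shard
        self.allow_migrations = True

    def donor_shards(self):
        # Donors are the shards that currently own at least one chunk.
        return set(self.chunk_owners.values())

    def move_chunk(self, chunk, to_shard):
        # A migration is rejected once migrations have been disabled.
        if not self.allow_migrations:
            return False
        self.chunk_owners[chunk] = to_shard
        return True


def reshard_racy(coll, concurrent_op):
    """Ordering described in the ticket: read donors, then block migrations."""
    donors = coll.donor_shards()   # snapshot taken too early
    concurrent_op()                # window where the failpoint is paused
    coll.allow_migrations = False
    return donors


def reshard_safe(coll, concurrent_op):
    """Proposed ordering: block migrations first, then read donors."""
    coll.allow_migrations = False
    concurrent_op()                # the migration attempt is now rejected
    return coll.donor_shards()


racy = ToyCollection({"c0": "shardA", "c1": "shardB"})
racy_donors = reshard_racy(racy, lambda: racy.move_chunk("c1", "shardC"))

safe = ToyCollection({"c0": "shardA", "c1": "shardB"})
safe_donors = reshard_safe(safe, lambda: safe.move_chunk("c1", "shardC"))
```

In the racy ordering, `racy_donors` still names shardB even though it no longer owns any chunk, while in the safe ordering the donor list matches the actual owners because the moveChunk is refused.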
|