|
POC 0001-SERVER-58416-POC.patch :
- Recipient enters the critical section (after the donor has entered their critical section and the recipient has finished cloning the documents)
- Then donor commits the chunk migration to the configsvr.
- Finally, donor refreshes its filtering metadata (as it already did), and donor issues a new command to the recipient so that it refreshes its filtering metadata and releases the critical section.
The POC only considers the happy path. The recovery has not been dealt with.
As an improvement, the filtering info refresh on the donor and on the recipient can be parallelized to happen concurrently.
The POC above makes that both donor and recipient hold the critical section during the commit of the migration on the configsvr, and refresh before exiting the critical section. As a result, both donor and recipient are always aware of which chunks they own (outside of their critical section).
This would allow shards to use their current filtering information to avoid making writes to orphaned documents.
In this POC, the updates to the donor's and recipient's config.rangeDeletions collection has not been included to happen in the criticalSection. This is not strictly necessary if we don't use config.rangeDeletions in order to figure the ownership for reads/writes filtering. If we decided to use config.rangeDeletions for that purpose, the updates would need to happen within the criticalSection. Moreover, the updates on donor and recipient should become visible at the same clusterTime, which must also be the same time as the validAfter stored in the chunk history (which is used for routing). This may be achieved by:
1. The donor reserves an oplog slot at time T
2. The donor commits the chunk migration to the configsvr with
{validAfter: T}
in the history
3. The donor commits a multi-shard txn at the reserved oplog time T on both donor and recipient, with the updates on config.rangeDeletions.
|