Consider the following interleaving:
1. [th1] Starts a multi write operation on an unsharded collection and passes the dbVersion check successfully.
2. [th1] The multi write yields.
3. [th2] A movePrimary starts and sets the 'move primary in progress' flag on the DatabaseShardingState.
4. [th2] MovePrimary commits.
5. [th2] MovePrimary unsets the 'move primary in progress' flag.
6. [th2] MovePrimary hangs before dropping the old collection from the former primary.
7. [th1] The multi write now resumes from the yield. Note that the writes never recheck the dbVersion on the op_observers (unlike the 'shardVersion', which we do recheck). Therefore, the writes don't fail, but they are lost because they happened on the old db-primary shard after the ownership change had already committed.
Note that on the op_observer we fail writes if there's a move primary operation in progress. This is what typically prevents losing writes. However, that check is only hit if the write resumes from the yield while the movePrimary is still in progress. In the interleaving above, the write resumes only after the flag has been unset, so the check passes. I don't think this interleaving is very likely, since it requires the write to stay yielded for a long time (from the moment the cloning started until after the commit happened).
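To make the interleaving easier to reason about, here is a minimal, single-threaded sketch that replays steps 1 to 7 deterministically. It is a toy model, not the server code: `DatabaseShardingState` here is a two-field stand-in, and `multi_write`, `run_interleaving`, `OldPrimaryShard`, and both exception types are hypothetical names invented for illustration. The `recheck_db_version` switch models one possible mitigation (rechecking the dbVersion when the write resumes), which is an assumption and not necessarily how this would be fixed.

```python
from dataclasses import dataclass, field


class StaleDbVersion(Exception):
    """Hypothetical error: the dbVersion changed while the write was yielded."""


class MovePrimaryInProgress(Exception):
    """Hypothetical error: a movePrimary is running on this database."""


@dataclass
class DatabaseShardingState:
    # Toy stand-in: only the two fields the scenario needs.
    db_version: int = 1
    move_primary_in_progress: bool = False


@dataclass
class OldPrimaryShard:
    dss: DatabaseShardingState = field(default_factory=DatabaseShardingState)
    docs: list = field(default_factory=list)


def multi_write(shard, docs, recheck_db_version):
    """Generator modelling th1; each `yield` is a point where the write yields."""
    version_at_start = shard.dss.db_version  # step 1: dbVersion check passes
    for doc in docs:
        yield  # step 2: the write yields
        # Op-observer-style check, run when the write resumes.
        if shard.dss.move_primary_in_progress:
            raise MovePrimaryInProgress()
        # Assumed mitigation: also compare the dbVersion seen at the start.
        if recheck_db_version and shard.dss.db_version != version_at_start:
            raise StaleDbVersion()
        shard.docs.append(doc)


def run_interleaving(recheck_db_version):
    shard = OldPrimaryShard()
    write = multi_write(shard, ["a", "b"], recheck_db_version)
    next(write)                                 # steps 1-2: th1 starts, yields
    shard.dss.move_primary_in_progress = True   # step 3: th2 sets the flag
    shard.dss.db_version += 1                   # step 4: movePrimary commits
    shard.dss.move_primary_in_progress = False  # step 5: th2 unsets the flag
    # step 6: th2 hangs before dropping the old collection
    try:
        for _ in write:                         # step 7: th1 resumes
            pass
        outcome = "ok"
    except (StaleDbVersion, MovePrimaryInProgress) as exc:
        outcome = type(exc).__name__
    return outcome, list(shard.docs)
```

With `recheck_db_version=False`, the write reports success and both documents land on the old primary, which is the lost-write case described above. With `recheck_db_version=True`, the first resumed write raises `StaleDbVersion` and nothing is written to the old primary.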
- causes: SERVER-80463 MigrationChunkClonerSourceOpObserver::onInserts() written to look like it skips checking some documents for whether their chunk has moved (Closed)