|
When the "fromMovePrimary" flag is true, CursorManager::invalidateAll() should set the Status of every cursor it kills to StaleDbVersion rather than QueryPlanKilled, which is what it currently sets.
Then, CursorManager::invalidateAll() with "fromMovePrimary=true" should be called when entering the movePrimary critical section. This will ensure that when the unsharded collections that were moved are dropped at the end of movePrimary, any yielded readers and writers on those unsharded collections throw StaleDbVersion (and so are retried against the new primary shard) rather than QueryPlanKilled (which would not cause them to be retried).
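The proposed behaviour can be sketched with a small stand-alone model. This is only an illustration under stated assumptions, not the server code: CursorManagerModel, CursorModel, KillReason and isRetriedAgainstNewPrimary are hypothetical names, and the real CursorManager::invalidateAll() takes other arguments and does more work than shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace move_primary_sketch {

// Hypothetical stand-in for the two error codes named in the ticket.
enum class KillReason { None, QueryPlanKilled, StaleDbVersion };

// Hypothetical, simplified cursor that only records why it was killed.
struct CursorModel {
    std::string ns;
    KillReason killedWith = KillReason::None;
};

// Simplified model of a cursor manager (assumption: not the real class).
class CursorManagerModel {
public:
    void registerCursor(CursorModel* cursor) { _cursors.push_back(cursor); }

    // Kills every registered cursor. The flag only selects the status
    // that the killed cursors will report when they resume.
    void invalidateAll(bool fromMovePrimary) {
        const KillReason reason = fromMovePrimary ? KillReason::StaleDbVersion
                                                  : KillReason::QueryPlanKilled;
        for (CursorModel* cursor : _cursors) {
            cursor->killedWith = reason;
        }
        _cursors.clear();
    }

    std::size_t numCursors() const { return _cursors.size(); }

private:
    std::vector<CursorModel*> _cursors;
};

// Per the ticket: StaleDbVersion causes a retry against the new primary
// shard, QueryPlanKilled does not.
inline bool isRetriedAgainstNewPrimary(KillReason reason) {
    return reason == KillReason::StaleDbVersion;
}

}  // namespace move_primary_sketch
```

In this model, calling invalidateAll(true) on entering the movePrimary critical section leaves every yielded operation with a status that leads to a retry, while invalidateAll(false) keeps today's QueryPlanKilled behaviour.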
Note that we do not want to call CursorManager::invalidateAll() when entering the moveChunk critical section: it kills both readers and writers, and we do not want to kill yielded readers on sharded collections. Readers on a sharded collection hold a ScopedCollectionMetadata, which prevents the RangeDeleter from deleting the data out from under them, so they do not need to be killed.
Because CursorManager::invalidateAll() cannot be used on entering the moveChunk critical section, we will continue to call checkShardVersion() in OpObservers, which causes yielded writers to throw StaleShardVersion if they resume after a moveChunk critical section has been entered.
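The writer-side check for moveChunk can be modelled in the same spirit. This is a rough sketch under assumptions: CollectionShardingModel, WriteObserverModel and resumeWriter are hypothetical names, and the real checkShardVersion() compares versions rather than reading a single flag.

```cpp
#include <cassert>
#include <stdexcept>

namespace move_chunk_sketch {

// Hypothetical exception standing in for a StaleShardVersion error.
struct StaleShardVersionError : std::runtime_error {
    StaleShardVersionError() : std::runtime_error("StaleShardVersion") {}
};

// Hypothetical per-collection sharding state, reduced to one flag.
struct CollectionShardingModel {
    bool inMoveChunkCriticalSection = false;

    // Simplified stand-in for checkShardVersion(): throws once the
    // moveChunk critical section has been entered.
    void checkShardVersion() const {
        if (inMoveChunkCriticalSection) {
            throw StaleShardVersionError();
        }
    }
};

// Simplified stand-in for an OpObserver hook that runs on each write.
struct WriteObserverModel {
    const CollectionShardingModel& state;

    void onWrite() const { state.checkShardVersion(); }
};

// A yielded writer resumes and attempts its write. Returns true if the
// write went through, false if it hit the stale-version error.
inline bool resumeWriter(const WriteObserverModel& observer) {
    try {
        observer.onWrite();
        return true;
    } catch (const StaleShardVersionError&) {
        return false;
    }
}

}  // namespace move_chunk_sketch
```

The point of the design is that only writers pass through the observer hook, so yielded readers are left alone while a writer that resumes inside the critical section fails with the stale-version error.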
|