We have two chunks on one shard, chunk A and chunk B. Chunk B is undergoing the clone phase of a migration. We have a document residing on chunk B. We decide to change the shard key of said document so that it would now reside on chunk A.
Because both chunks are on the same shard, changing the shard key of a document inside chunk B will be modeled as a single update operation. The shard key update is atomic in this case, so there is no need for a transactional delete/insert pair.
However, changing the shard key will not be caught by the current set of migration op observers. As of the writing of this ticket, we only observe updates that fall within the range of the chunk being migrated, and the post-image of the updated document will no longer fall within that range. This means that the migration would effectively duplicate the document: we would have one copy with the new shard key on the original shard, and another copy with the old shard key on the new shard.
We would like to observe both the pre- and post-image of the document on an update. If the pre-image fits within the chunk and the post-image does not (implying a shard key change), we will tell the migration to observe a delete of the document. That sounds scary at first glance, but it's not – if we are observing any update operation, that implies the update is being committed to storage on the source shard. Sending a delete of that document to the migration listener (and then destination shard) simply omits the document from the most up-to-date state of the moved chunk.
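The pre-/post-image check described above can be sketched roughly as follows. This is only an illustration of the decision logic, not the server's actual op observer code: `ChunkRange`, `op_to_forward`, the integer shard keys, and the string return values are all hypothetical names and simplifications introduced for this example. It also assumes that an update whose post-image lands inside the migrating chunk keeps being forwarded as an update, as it is today.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChunkRange:
    """Hypothetical half-open shard key range [min_key, max_key)."""
    min_key: int
    max_key: int

    def contains(self, shard_key: int) -> bool:
        return self.min_key <= shard_key < self.max_key


def op_to_forward(migrating: ChunkRange, pre_key: int, post_key: int) -> Optional[str]:
    """Decide what the migration should be told about an update.

    pre_key / post_key are the shard key values of the document's
    pre-image and post-image (integers here purely for simplicity).
    """
    if migrating.contains(post_key):
        # Post-image is inside the migrating chunk: forward the update
        # (assumed to match the existing observer behavior).
        return "update"
    if migrating.contains(pre_key):
        # Pre-image was in the chunk but the post-image is not: the shard
        # key changed, so the migration should observe a delete.
        return "delete"
    # Neither image touches the migrating chunk: nothing to forward.
    return None
```

For example, with chunk B covering `[100, 200)`, an update that moves a document's shard key from 150 to 50 (into chunk A) would yield `"delete"`, while an update from 150 to 160 would still yield `"update"`.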
- What happens if the migration gets aborted and retries? The shard key update has already been committed on the source shard, so the document now belongs to chunk A. The new migration will therefore no longer see the document in its initial image of the migrating chunk, and it will continue as normal.