[SERVER-39940] Model a shard key update as a delete inside the chunk migration cloner if the document moves out of a currently-migrating chunk Created: 04/Mar/19  Updated: 29/Oct/23  Resolved: 24/Apr/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.1.11

Type: Task Priority: Major - P3
Reporter: Blake Oler Assignee: Blake Oler
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-40350 Create a ReplOperation/Durable class ... Closed
is depended on by SERVER-39844 Create concurrency workload with migr... Closed
Problem/Incident
causes SERVER-68361 LogTransactionOperationsForShardingHa... Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2019-04-08, Sharding 2019-04-22, Sharding 2019-05-06
Participants:

 Description   

Scenario

We have two chunks on one shard, chunk A and chunk B. Chunk B is undergoing the clone phase of a migration. We have a document residing on chunk B. We decide to change the shard key of said document so that it would now reside on chunk A.

Problem

If we change the shard key of a document inside chunk B, the change is modeled as an update operation. This is because when both chunks are on the same shard, the shard key update is atomic, so there is no need for a transactional delete/insert pair.

However, the current set of migration op observers will not catch this shard key change. As of the writing of this ticket, we only observe updates that fall within the range of the chunk being migrated. The post-image of the updated document will not match the chunk being moved, so the update is never forwarded to the destination shard. This means that the migration would effectively duplicate the document: we would have one copy with the new shard key on the original shard, and another copy with the old shard key on the new shard.
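The duplication described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the server's code: the `ChunkRange` class, the integer shard key field `x`, the key ranges, and the dictionary-based "shards" are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRange:
    """Hypothetical half-open shard key range [min_key, max_key)."""
    min_key: int
    max_key: int

    def contains(self, key: int) -> bool:
        return self.min_key <= key < self.max_key


def simulate_post_image_only_observer():
    """Replay the ticket's scenario with an observer that checks only the post-image."""
    chunk_b = ChunkRange(100, 200)  # chunk B, currently being migrated

    # The source shard owns chunk A ([0, 100) here) and chunk B.
    # The document starts out in chunk B.
    source_shard = {1: {"_id": 1, "x": 150}}

    # Clone phase: the destination receives every document currently in chunk B.
    destination_shard = {
        doc_id: dict(doc)
        for doc_id, doc in source_shard.items()
        if chunk_b.contains(doc["x"])
    }

    # Shard key update on the source: the document moves from chunk B to chunk A.
    post_image = {"_id": 1, "x": 50}
    source_shard[1] = post_image

    # The observer forwards the update only if the post-image is in chunk B.
    # It is not, so the destination never hears about the change.
    if chunk_b.contains(post_image["x"]):
        destination_shard[1] = dict(post_image)

    return source_shard, destination_shard
```

After the simulated run, the source holds the document with `x == 50` and the destination still holds the stale clone with `x == 150`, which is the duplicate the ticket describes.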

Solution

We would like to observe both the pre- and post-image of the document on an update. If the pre-image fits within the chunk and the post-image does not (implying a shard key change), we will tell the migration to observe a delete of the document. That sounds scary at first glance, but it is safe: if we are observing any update operation, the update is being committed to storage on the source shard. Sending a delete of that document to the migration listener (and then to the destination shard) simply omits the document from the most up-to-date state of the moved chunk.
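The pre-/post-image check proposed above might be sketched as follows. This is a minimal illustration, not the actual MigrationChunkClonerSource logic: `classify_update`, `MigrationOp`, and the integer half-open range are hypothetical names and simplifications. The case where a document moves into the migrating chunk is assumed to keep being observed as an update, since the ticket does not discuss it.

```python
from enum import Enum
from typing import Optional


class MigrationOp(Enum):
    """Hypothetical labels for what the migration is told to observe."""
    UPDATE = "update"
    DELETE = "delete"


def classify_update(pre_key: int, post_key: int,
                    min_key: int, max_key: int) -> Optional[MigrationOp]:
    """Decide how an update is reported for the chunk [min_key, max_key).

    pre_key and post_key are the shard key values of the document's
    pre-image and post-image.
    """
    pre_in_chunk = min_key <= pre_key < max_key
    post_in_chunk = min_key <= post_key < max_key

    if post_in_chunk:
        # The post-image lies in the migrating chunk: an ordinary update.
        # (Assumption: this also covers a document moving into the chunk.)
        return MigrationOp.UPDATE
    if pre_in_chunk:
        # The document left the migrating chunk because its shard key
        # changed, so the migration observes a delete instead.
        return MigrationOp.DELETE
    # Neither image touches the migrating chunk: nothing to observe.
    return None
```

For example, with chunk B as `[100, 200)`, `classify_update(150, 50, 100, 200)` yields `MigrationOp.DELETE`, so the destination drops its cloned copy instead of keeping a stale duplicate.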

Questions/Answers.

  1. What happens if the migration is aborted and retried? If this happens, the new migration will no longer see the document in its initial image of the migrating chunk, because the document now resides in chunk A. The migration will continue as normal.


 Comments   
Comment by Githook User [ 24/Apr/19 ]

Author: Blake Oler (BlakeIsBlake) <blake.oler@mongodb.com>

Message: SERVER-39940 Model a shard key update as a delete inside the chunk migration cloner if the document moves out of a currently-migrating chunk
Branch: master
https://github.com/mongodb/mongo/commit/cde931d2154150ca681e147d47921f51801de174

Comment by Kaloian Manassiev [ 05/Mar/19 ]

This approach LGTM. When implementing it, it would be good if this logic could be entirely encapsulated behind the MigrationChunkClonerSource::onUpdate method, so that all the shard key extraction and checking against the filtering metadata happens in one place. This will also help ensure that it happens only while a migration is in progress and does not impact performance in the steady state.

Comment by Randolph Tan [ 04/Mar/19 ]

lgtm

Comment by Blake Oler [ 04/Mar/19 ]

kaloian.manassiev renctan lgty?
