[SERVER-66629] Convergence of the migration catch up phase when the user performs batched insertions/deletions Created: 21/May/22  Updated: 01/Feb/23  Resolved: 01/Feb/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Sergi Mateo Bellido Assignee: [DO NOT USE] Backlog - Sharding NYC
Resolution: Won't Do Votes: 0
Labels: sharding-nyc-subteam2
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Sharding NYC
Participants:
Linked BF Score: 134

 Description   

The goal of this task is to understand how batched insertions/deletions on the donor shard affect the migration catch-up phase.

More specifically, we recently got two BFGs, BFG-1172094 and BFG-1172896 (part of BF-25179), which showed that in the presence of batched deletions the catch-up phase on the recipient did not manage to converge within 37.5s.
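
To make the convergence condition concrete, here is a toy back-of-the-envelope model, not the server's actual logic. It assumes the recipient applies changes at a constant rate while the donor keeps producing new changes at a constant rate. The function name and both rate parameters are hypothetical. Only the 37.5s window and the 100K-doc batch size come from this ticket.

```python
def catch_up_converges(backlog_docs, apply_rate, write_rate, deadline_s=37.5):
    """Toy model: does the recipient drain the backlog before the deadline?

    backlog_docs: changes not yet applied when catch-up starts
    apply_rate:   docs/s the recipient applies (hypothetical value)
    write_rate:   docs/s of new changes on the donor (hypothetical value)
    deadline_s:   catch-up window; 37.5s is the figure quoted in the ticket
    """
    if backlog_docs <= 0:
        return True  # nothing left to apply
    net_rate = apply_rate - write_rate
    if net_rate <= 0:
        return False  # the backlog never shrinks
    return backlog_docs / net_rate <= deadline_s


# A single batched delete of 100K docs with made-up rates:
print(catch_up_converges(100_000, apply_rate=2_000, write_rate=0))      # False (50s needed)
print(catch_up_converges(100_000, apply_rate=4_000, write_rate=1_000))  # True (about 33.3s)
```

Under these assumptions, a large batch that lands all at once can push the drain time past the window even when the donor is otherwise idle.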

It is also worth pointing out that, while applying changes on the recipient, we only check whether the migration was aborted between batches. In these two BFGs there was just one batch, but it contained 100K docs, so the abort took a long time to take effect.
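
The abort-latency problem can be illustrated with a minimal sketch. This is not the server's code: `apply_changes`, `apply_doc` and `check_every` are hypothetical names. With `check_every=None` the loop checks the abort flag only between batches, as described above. With a value set, it also checks periodically inside a batch, which is one possible mitigation.

```python
import threading


def apply_changes(batches, apply_doc, aborted, check_every=None):
    """Apply batches of docs; return how many docs were applied before stopping.

    aborted:     threading.Event that is set when the migration is aborted
    check_every: None -> check for abort only between batches;
                 N    -> additionally check every N docs inside a batch
    """
    applied = 0
    for batch in batches:
        if aborted.is_set():  # between-batch check
            return applied
        for i, doc in enumerate(batch):
            # Optional in-batch check (hypothetical mitigation).
            if check_every and i % check_every == 0 and aborted.is_set():
                return applied
            apply_doc(doc)
            applied += 1
    return applied


def run(check_every):
    """Simulate one 100K-doc batch where the abort arrives after 10 docs."""
    aborted = threading.Event()
    seen = [0]

    def apply_doc(doc):
        seen[0] += 1
        if seen[0] == 10:
            aborted.set()

    return apply_changes([range(100_000)], apply_doc, aborted, check_every)


print(run(None))   # 100000: the abort is only noticed after the whole batch
print(run(1_000))  # 1000: the abort is noticed at the next in-batch check
```

In this sketch the worst-case abort latency is bounded by `check_every` docs rather than by the batch size.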



 Comments   
Comment by Kshitij Gupta [ 01/Feb/23 ]

Discussed this offline with sergi.mateo-bellido@mongodb.com. With the recent work on parallelizing chunk migrations, this is less likely to happen. Closing this.
