[SERVER-60142] Shard can migrate on top of orphans after filtering metadata was cleared Created: 22/Sep/21  Updated: 29/Oct/23  Resolved: 23/Sep/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.4.8, 5.0.2
Fix Version/s: 4.4.10, 5.0.4, 5.1.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Jordi Serra Torrens
Assignee: Jordi Serra Torrens
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
backported by SERVER-66433 Backport deadline waiting for overlap... Closed
Depends
Duplicate
Problem/Incident
is caused by SERVER-52906 moveChunk after failed migration that... Closed
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v5.0, v4.4
Participants:
Linked BF Score: 74

 Description   

A migration recipient may start cloning data that overlaps with an ongoing range deletion if its filtering metadata was cleared before it started receiving the migration.

1. Suppose the shard has an ongoing range deletion (e.g. it donated a chunk).
2. For whatever reason (e.g. a failed metadata refresh), the filtering metadata gets cleared.
3. The shard now starts receiving a chunk that overlaps that range deletion (i.e. the same chunk it recently donated away).
4. MigrationDestinationManager sees that there is an existing overlapping range deletion document, so it attempts to wait for the range deletion task to finish through the CSR (CollectionShardingRuntime). However, because the metadata was cleared in step (2), the current metadata is not aware of that range deletion, so 'waitForClean' returns OK right away.
5. MigrationDestinationManager then begins cloning documents even though the range deletion is still ongoing and may delete them, causing data loss.

This regression was introduced in SERVER-52906, because the 'while' in MigrationDestinationManager's check for overlapping range deletion tasks was changed to an 'if'.
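
The difference between the two forms of the check can be illustrated with a toy model. This is only a sketch, not MongoDB code: ToyRecipient, polls_until_deleted, receive_with_if and receive_with_while are hypothetical names, and wait_for_clean is a stand-in that always returns immediately, as 'waitForClean' does once the metadata has been cleared. The model assumes the range deleter keeps making progress in the background and removes the range deletion document when it finishes.

```python
class ToyRecipient:
    """Toy recipient shard whose filtering metadata has been cleared (step 2)."""

    def __init__(self, deletion_range, polls_until_deleted):
        # Persisted range deletion documents, as half-open (min, max) ranges.
        self.pending = [deletion_range]
        # Hypothetical knob: how many polls pass before the deleter finishes.
        self.polls_left = polls_until_deleted

    def has_overlapping_deletion(self, lo, hi):
        # Two half-open ranges overlap if each starts before the other ends.
        return any(lo < d_hi and d_lo < hi for d_lo, d_hi in self.pending)

    def wait_for_clean(self, lo, hi):
        # The cleared metadata knows nothing about the task, so there is
        # nothing to wait on and the call returns right away. Time still
        # passes, so the background deleter gets one step closer to done.
        self.polls_left -= 1
        if self.polls_left <= 0:
            self.pending.clear()
        return "OK"


def receive_with_if(shard, lo, hi):
    # Regressed form: a single check, then proceed regardless of the outcome.
    if shard.has_overlapping_deletion(lo, hi):
        shard.wait_for_clean(lo, hi)
    # True means it is safe to start cloning.
    return not shard.has_overlapping_deletion(lo, hi)


def receive_with_while(shard, lo, hi):
    # Looping form: re-check the persisted documents after every wait.
    while shard.has_overlapping_deletion(lo, hi):
        shard.wait_for_clean(lo, hi)
    return not shard.has_overlapping_deletion(lo, hi)


if __name__ == "__main__":
    print("if:   ", receive_with_if(ToyRecipient((0, 10), 3), 0, 10))
    print("while:", receive_with_while(ToyRecipient((0, 10), 3), 0, 10))
```

In this model the 'if' form reaches the cloning step while the range deletion document still exists, whereas the 'while' form only proceeds once no overlapping document remains. The actual change is in the commits linked in the comments.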



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 28/Sep/21 ]

Author: Jordi Serra Torrens <jordi.serra-torrens@mongodb.com> (jordist)

Message: SERVER-60142 Fix MigrationDestinationManager's check for overlapping rangeDeletion tasks
Branch: v4.4
https://github.com/mongodb/mongo/commit/77e19ae35b3ed0ddcf260103df4d61b53941b27c

Comment by Githook User [ 23/Sep/21 ]

Author: Jordi Serra Torrens <jordi.serra-torrens@mongodb.com> (jordist)

Message: SERVER-60142 Fix MigrationDestinationManager's check for overlapping rangeDeletion tasks
Branch: v5.0
https://github.com/mongodb/mongo/commit/dfff7baa761d2a32ce82f8902659819905bd0a57

Comment by Githook User [ 23/Sep/21 ]

Author: Jordi Serra Torrens <jordi.serra-torrens@mongodb.com> (jordist)

Message: SERVER-60142 Fix MigrationDestinationManager's check for overlapping rangeDeletion tasks
Branch: master
https://github.com/mongodb/mongo/commit/92738c5fa0e8169299e5393e88159b8cbb9559ca

Generated at Thu Feb 08 05:49:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.