[SERVER-46395] Resumable range deleter can submit duplicate range deletion tasks on step up Created: 25/Feb/20 Updated: 29/Oct/23 Resolved: 13/Mar/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.0-rc0, 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Max Hirschhorn | Assignee: | Max Hirschhorn |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: |
v4.4
|
| Steps To Reproduce: | Apply the following patch build, which introduces an invariant into MetadataManager::_submitRangeForDeletion() to check for range deletion tasks on the same chunk range, and run the following resmoke.py invocation.
|
| Sprint: | Sharding 2020-03-09, Sharding 2020-03-23 |
| Participants: |
| Linked BF Score: | 0 |
| Description |
|
If the replica set shard primary steps down before it has processed a range deletion task for a chunk that moved off the shard, and steps back up very shortly afterwards, multiple range deletion tasks for the same range can be scheduled and eventually run. This is problematic because the first task to complete removes the document from the config.rangeDeletions collection. Subsequent checks for whether the deletion of a range is scheduled are based on the contents of the config.rangeDeletions collection (rather than the in-memory state), so the shard could accept a chunk migration for an overlapping range while a duplicate task is still pending. The range deleter would then be able to remove documents that are actually owned by the shard. Credit to esha.maharishi for finding this issue. |
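The sequence described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyShard`, `ChunkRange`, `simulate`, and the `reject_duplicates` guard are hypothetical names invented for illustration, not MongoDB server code, and the guard is merely in the spirit of the invariant mentioned under Steps To Reproduce rather than the fix that was committed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChunkRange:
    """Half-open shard key range [min_key, max_key)."""
    min_key: int
    max_key: int

    def overlaps(self, other: "ChunkRange") -> bool:
        return self.min_key < other.max_key and other.min_key < self.max_key


@dataclass
class ToyShard:
    """Toy stand-in for a shard primary; none of these names are MongoDB APIs."""
    reject_duplicates: bool = False
    documents: set = field(default_factory=set)        # shard keys stored locally
    range_deletions: set = field(default_factory=set)  # models config.rangeDeletions
    scheduled: list = field(default_factory=list)      # in-memory scheduled tasks

    def submit_range_deletion(self, rng: ChunkRange) -> bool:
        # Hypothetical guard: refuse a second task for an already scheduled range.
        if self.reject_duplicates and rng in self.scheduled:
            return False
        self.range_deletions.add(rng)
        self.scheduled.append(rng)
        return True

    def run_next_task(self) -> None:
        rng = self.scheduled.pop(0)
        self.documents = {
            k for k in self.documents if not (rng.min_key <= k < rng.max_key)
        }
        # The first task to finish removes the persisted task document.
        self.range_deletions.discard(rng)

    def accept_migration(self, rng: ChunkRange, keys: set) -> bool:
        # The overlap check consults only the persisted collection,
        # not the in-memory list of scheduled tasks.
        if any(rng.overlaps(pending) for pending in self.range_deletions):
            return False
        self.documents |= set(keys)
        return True


def simulate(reject_duplicates: bool) -> set:
    shard = ToyShard(reject_duplicates=reject_duplicates, documents={1, 2, 3, 50})
    moved = ChunkRange(0, 10)
    shard.submit_range_deletion(moved)  # scheduled when the chunk moved off
    shard.submit_range_deletion(moved)  # resubmitted after step down and step up
    shard.run_next_task()               # first task completes
    shard.accept_migration(ChunkRange(5, 15), {6, 7})  # overlapping chunk arrives
    while shard.scheduled:
        shard.run_next_task()           # any leftover duplicate task runs here
    return shard.documents
```

In this model, `simulate(False)` ends with the shard holding only key `50`: the duplicate task deletes keys `6` and `7` even though the shard now owns them. With the hypothetical guard enabled, `simulate(True)` keeps `6`, `7`, and `50`.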
| Comments |
| Comment by Githook User [ 28/Mar/20 ] |
|
Author: {'name': 'Max Hirschhorn', 'username': 'visemet', 'email': 'max.hirschhorn@mongodb.com'} Message: Changes the range deletion task to guarantee a batch of documents can Changes the deletion of the range deletion task document to use the _id (cherry picked from commit 90eefa051e6015514dcc6256d0f42b76bf041a76) |
| Comment by Githook User [ 12/Mar/20 ] |
|
Author: {'name': 'Max Hirschhorn', 'username': 'visemet', 'email': 'max.hirschhorn@mongodb.com'} Message: |
| Comment by Githook User [ 11/Mar/20 ] |
|
Author: {'name': 'Max Hirschhorn', 'username': 'visemet', 'email': 'max.hirschhorn@mongodb.com'} Message: Changes the range deletion task to guarantee a batch of documents can Changes the deletion of the range deletion task document to use the _id |
| Comment by Githook User [ 02/Mar/20 ] |
|
Author: {'username': 'visemet', 'name': 'Max Hirschhorn', 'email': 'max.hirschhorn@mongodb.com'} Message: Introduces a joinAsync() method to the TaskExecutor interface to provide |