[SERVER-77513] Range deletion document removal commit must trigger in-memory state update
| Created: 26/May/23 | Updated: 20/Nov/23 | Resolved: 04/Jul/23 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 6.2.1, 7.1.0-rc0, 6.3.1, 7.0.0-rc2 |
| Fix Version/s: | 7.1.0-rc0, 7.0.0-rc7 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Allison Easton | Assignee: | Pierlauro Sciarelli |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | auto-reverted |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Sharding EMEA |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v7.0 |
| Sprint: | Sharding EMEA 2023-06-12, Sharding EMEA 2023-06-26, Sharding EMEA 2023-07-10 |
| Participants: | |
| Linked BF Score: | 166 |
| Description |
|
Full explanation in the final comment of BF-27927. The range deleter's onInserts op observer schedules the in-memory state update to happen on commit, whereas the onDelete op observer performs its update right away. This is usually fine because enough time passes between inserting a range deletion document and deleting it that the two updates do not overlap. However, in the presence of renames, the insertion of the range deletion document by the rename thread and the deletion of that document can happen immediately after one another (less than a millisecond apart).
In this case, the updates to the range deleter's in-memory state can be applied in the wrong order: the immediate onDelete update runs before the onInserts update that was scheduled for commit, so the removal finds nothing to clear and the registration is applied afterwards.
The range deleter will then perform the deletion for the in-memory range deletion, but there is no task document to remove, so the in-memory state is never cleared (it is only updated via onDelete). We now have in-memory state for a range deletion that will never be removed. This prevents all further range deletions for this range because they will join the existing in-memory registration.
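The misordering described above can be reproduced with a small toy model. This is only an illustrative sketch in Python, not the server's C++ implementation: `ToyTransaction`, `ToyRangeDeleter`, and `simulate_race` are hypothetical names, and the model assumes that a written document becomes visible to other threads before the writer's commit handlers have run.

```python
class ToyTransaction:
    """Hypothetical stand-in for a storage transaction with commit handlers."""

    def __init__(self):
        self._commit_handlers = []

    def on_commit(self, handler):
        self._commit_handlers.append(handler)

    def run_commit_handlers(self):
        for handler in self._commit_handlers:
            handler()
        self._commit_handlers.clear()


class ToyRangeDeleter:
    """Hypothetical model: task documents on disk plus in-memory registrations."""

    def __init__(self):
        self.task_docs = set()   # stands in for range deletion documents
        self.in_memory = set()   # stands in for in-memory registrations

    def insert_task(self, txn, rng):
        self.task_docs.add(rng)
        # onInserts-like behaviour: registration is deferred to commit.
        txn.on_commit(lambda: self.in_memory.add(rng))

    def delete_task_eagerly(self, txn, rng):
        self.task_docs.discard(rng)
        # onDelete-like behaviour before the fix: update right away.
        self.in_memory.discard(rng)


def simulate_race():
    deleter = ToyRangeDeleter()
    rng = ("min", "max")
    insert_txn, delete_txn = ToyTransaction(), ToyTransaction()

    deleter.insert_task(insert_txn, rng)
    # The delete lands before the insert's commit handler has run,
    # so the eager in-memory removal is a no-op.
    deleter.delete_task_eagerly(delete_txn, rng)
    insert_txn.run_commit_handlers()  # the registration is applied last
    delete_txn.run_commit_handlers()
    return deleter


leaked = simulate_race()
print("task documents:", leaked.task_docs)
print("in-memory registrations:", leaked.in_memory)
```

In this model the run ends with no task document but with a leftover in-memory registration, which is the stuck state the description reports.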
|
| Comments |
| Comment by Githook User [ 06/Jul/23 ] |
|
Author: {'name': 'Pierlauro Sciarelli', 'email': 'pierlauro.sciarelli@mongodb.com', 'username': 'pierlauro'} Message: |
| Comment by Githook User [ 03/Jul/23 ] |
|
Author: {'name': 'Pierlauro Sciarelli', 'email': 'pierlauro.sciarelli@mongodb.com', 'username': 'pierlauro'} Message: |
| Comment by Githook User [ 03/Jul/23 ] |
|
Author: {'name': 'auto-revert-processor', 'email': 'dev-prod-dag@mongodb.com', 'username': ''} Message: Revert " This reverts commit 866f81bb3016d4db13771a2be827f17cb6520ea1. |
| Comment by Githook User [ 03/Jul/23 ] |
|
Author: {'name': 'Pierlauro Sciarelli', 'email': 'pierlauro.sciarelli@mongodb.com', 'username': 'pierlauro'} Message: |
| Comment by xgen-buildbaron-user [ 01/Jul/23 ] |
|
Ticket re-opened due to revert. sharding_csrs_continuous_config_stepdown began consistently failing jstests/sharding/analyze_shard_key/analyze_shard_key_read_preference.js |
| Comment by Githook User [ 01/Jul/23 ] |
|
Author: {'name': 'auto-revert-processor', 'email': 'dev-prod-dag@mongodb.com', 'username': ''} Message: Revert " This reverts commit 866f81bb3016d4db13771a2be827f17cb6520ea1. |
| Comment by Githook User [ 30/Jun/23 ] |
|
Author: {'name': 'Pierlauro Sciarelli', 'email': 'pierlauro.sciarelli@mongodb.com', 'username': 'pierlauro'} Message: |
| Comment by Max Hirschhorn [ 26/May/23 ] |
The storage transaction could always roll back. The ticket description implies that RangeDeleterServiceOpObserver::onDelete() updates the in-memory state speculatively (and it doesn't register an onRollback() handler to account for this). I feel like the correct fix would be to have RangeDeleterServiceOpObserver::onDelete() register an onCommit() handler to update the in-memory state. The pattern we typically follow in the codebase, which empirically is less error-prone, is to "update on-disk state first, then update in-memory state second". |
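
The difference between a speculative update and an update registered on commit can be sketched with a toy transaction. This is an illustrative Python model under stated assumptions, not the server's C++ code: `ToyTransaction`, `delete_task`, and `run_rolled_back_delete` are hypothetical names, and the only behaviour modelled is that a rollback discards both pending writes and pending commit handlers.

```python
class ToyTransaction:
    """Hypothetical storage transaction: writes and handlers apply only on commit."""

    def __init__(self):
        self._pending_writes = []
        self._commit_handlers = []

    def write(self, apply_fn):
        self._pending_writes.append(apply_fn)

    def on_commit(self, handler):
        self._commit_handlers.append(handler)

    def commit(self):
        # On-disk state first, then in-memory state.
        for apply_fn in self._pending_writes:
            apply_fn()
        for handler in self._commit_handlers:
            handler()

    def rollback(self):
        # Nothing that was only scheduled takes effect.
        self._pending_writes.clear()
        self._commit_handlers.clear()


def delete_task(txn, task_docs, in_memory, rng, defer_to_commit):
    """Delete a task document and update the in-memory registration."""
    txn.write(lambda: task_docs.discard(rng))
    if defer_to_commit:
        # Suggested pattern: the in-memory update follows the on-disk commit.
        txn.on_commit(lambda: in_memory.discard(rng))
    else:
        # Speculative pattern: the in-memory update happens immediately.
        in_memory.discard(rng)


def run_rolled_back_delete(defer_to_commit):
    rng = ("min", "max")
    task_docs, in_memory = {rng}, {rng}
    txn = ToyTransaction()
    delete_task(txn, task_docs, in_memory, rng, defer_to_commit)
    txn.rollback()
    return task_docs, in_memory


print("speculative:", run_rolled_back_delete(defer_to_commit=False))
print("on commit:  ", run_rolled_back_delete(defer_to_commit=True))
```

In this model a rolled-back delete leaves the speculative variant with a task document but no in-memory registration, while the on-commit variant keeps both in agreement.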