- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 6.2.1, 7.1.0-rc0, 6.3.1, 7.0.0-rc2
- Component/s: Sharding
- Sharding EMEA
- Fully Compatible
- ALL
- v7.0
- Sharding EMEA 2023-06-12, Sharding EMEA 2023-06-26, Sharding EMEA 2023-07-10
- 166
Full explanation in final comment of BF-27927.
The range deleter's onInserts op observer schedules its update to the in-memory state to happen on commit, whereas the onDelete op observer does its work right away. This is usually fine because enough time passes between inserting a range deletion document and deleting it that the two updates do not overlap. However, in the presence of renames, the following scenario can occur:
- Migration finishes and a range deletion task is marked as ready for processing, thus scheduling it on the range deleter
- The rename snapshots the range deletion task
- The range deletion happens and the deletions are replicated to secondaries (but the task document is still there)
- The rename inserts the snapshotted document into config.rangeDeletions
- The range deleter deletes all overlapping task documents for the range
In this case, the rename thread's insertion of the document and the deletion of that document can happen immediately after one another (within less than a millisecond). The updates to the range deleter's in-memory state can then be ordered incorrectly as follows:
- The onInserts observer runs, scheduling the in-memory registration of the range deletion for the onCommit observer
- The onDelete observer runs, removing the range deletion from memory (which is a no-op because it has not yet been added)
- The onCommit observer runs, inserting the range deletion task into memory
The range deleter will then do the deletion for the in-memory range deletion, but there is no task document to remove, so the in-memory state will never be cleared (since it is only updated via onDelete).
In this scenario, we now have in-memory state for a range deletion that will never be removed. This prevents all further range deletions for this range because they will join the existing in-memory registration.
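The ordering problem can be reproduced with a toy model. This is only an illustrative sketch, not the server code: `ToyRangeDeleter`, its method names, and the task id `"task-1"` are hypothetical, and the commit step is compressed into a single explicit call.

```python
class ToyRangeDeleter:
    """Hypothetical stand-in for the range deleter's in-memory registry."""

    def __init__(self):
        self.registered = set()  # in-memory range deletion registrations
        self.on_commit = []      # callbacks deferred until commit

    def on_inserts(self, task_id):
        # Deferred: the registration only happens when commit callbacks run.
        self.on_commit.append(lambda: self.registered.add(task_id))

    def on_delete(self, task_id):
        # Immediate: the de-registration happens right away.
        self.registered.discard(task_id)

    def commit(self):
        # Run and clear all deferred callbacks.
        callbacks, self.on_commit = self.on_commit, []
        for callback in callbacks:
            callback()


def simulate_race():
    deleter = ToyRangeDeleter()
    deleter.on_inserts("task-1")  # rename re-inserts the snapshotted document
    deleter.on_delete("task-1")   # task document deleted: no-op in memory
    deleter.commit()              # deferred registration lands last
    return deleter.registered


print(simulate_race())  # {'task-1'}
```

The entry left behind corresponds to the in-memory registration that nothing will ever remove, because no task document remains to trigger another onDelete.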
A possible solution to this would be to register the range deletion task in the onInserts observer rather than scheduling it for onCommit.
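As a rough sketch of that idea, again using a hypothetical toy model rather than the actual server code, registering eagerly makes the same insert-then-delete interleaving leave no stale state. The class and function names here are assumptions made for illustration.

```python
class ToyEagerRangeDeleter:
    """Hypothetical registry that registers in on_inserts, not on commit."""

    def __init__(self):
        self.registered = set()  # in-memory range deletion registrations

    def on_inserts(self, task_id):
        # Immediate: registration happens as soon as the insert is observed.
        self.registered.add(task_id)

    def on_delete(self, task_id):
        # Immediate: de-registration happens right away.
        self.registered.discard(task_id)


def simulate_same_interleaving():
    deleter = ToyEagerRangeDeleter()
    deleter.on_inserts("task-1")  # rename re-inserts the snapshotted document
    deleter.on_delete("task-1")   # task document deleted: entry is removed
    return deleter.registered


print(simulate_same_interleaving())  # set()
```

The toy does not model an insert whose write is later rolled back; a real change would presumably need to account for that case, which this sketch leaves out.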
- causes: SERVER-83454 Range Deleter Service registration and de-registration should not rely on onCommit ordering guarantees (Closed)
- is caused by: SERVER-67849 Implement a range deleter service observer (Closed)