Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 6.2.0-rc0
Affects Version/s: None
Component/s: Sharding
Labels: PM-2144-Milestone-0
Assigned Teams: Sharding EMEA
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.4
Sprint: Sharding 2020-04-06, Sharding 2020-04-20, Sharding 2020-05-04, Sharding 2020-05-18, Sharding 2020-07-13, Sharding 2020-06-01, Sharding 2020-06-15, Sharding 2020-06-29, Sharding 2020-07-27, Sharding 2020-08-24
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

When the resumable range deleter is disabled, the recipient of a chunk starts by removing potentially orphaned documents. After that, it clones necessary indexes from the donor.

However, the range deleter relies on the shard key index in order to perform deletions.

This can lead to the following scenario:
1. A moveChunk begins
2. The shard key is refined
3. The moveChunk fails on the recipient for some reason, causing the entire moveChunk to fail
4. The moveChunk is restarted, now with a refined shard key
5. The recipient of the moveChunk attempts to delete the incoming range using the range deleter with the refined shard key
6. The range deleter loops infinitely because it is unable to find a shard key index.
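The scenario above can be sketched as a small toy model. This is not the attached repro and not MongoDB code: `ToyRecipient`, `has_shard_key_index`, the field names `x` and `y`, and the capped retry count are all hypothetical, and the recipient's index state after the failed first attempt is an assumption. Key patterns are reduced to ordered tuples of field names, and "shard key index" is taken to mean an index whose key pattern starts with the shard key.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Toy representation: a key pattern is an ordered tuple of field names.
KeyPattern = Tuple[str, ...]


def has_shard_key_index(indexes: List[KeyPattern], shard_key: KeyPattern) -> bool:
    """Return True if some index has the shard key as a prefix of its key pattern."""
    return any(idx[: len(shard_key)] == shard_key for idx in indexes)


@dataclass
class ToyRecipient:
    """Hypothetical stand-in for the recipient shard of a moveChunk."""

    indexes: List[KeyPattern] = field(default_factory=lambda: [("_id",)])

    def delete_range(self, shard_key: KeyPattern, max_attempts: int = 5) -> bool:
        """Retry the range deletion until a usable index is found.

        The ticket describes an unbounded loop; the cap here only exists so
        that the sketch terminates.
        """
        for _ in range(max_attempts):
            if has_shard_key_index(self.indexes, shard_key):
                return True  # deletion can proceed
        return False  # every attempt failed to find a shard key index

    def clone_indexes(self, donor_indexes: List[KeyPattern]) -> None:
        """Copy any donor index the recipient does not have yet."""
        for idx in donor_indexes:
            if idx not in self.indexes:
                self.indexes.append(idx)


# Assumed state after the failed first attempt: the recipient only has
# indexes that predate the refine, so nothing covers the refined key.
recipient = ToyRecipient(indexes=[("_id",), ("x",)])
refined_key = ("x", "y")

# Steps 5 and 6: deletion is attempted before indexes are cloned.
stuck = not recipient.delete_range(refined_key)

# Once an index on the refined key exists, the same deletion succeeds.
recipient.clone_indexes([("_id",), ("x", "y")])
unblocked = recipient.delete_range(refined_key)
```

In this model the deletion can never succeed until an index on the refined key exists on the recipient. The ticket text does not say how the actual fix addresses this, so the sketch only illustrates the failure condition.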

There may be less convoluted scenarios that could also cause this, but I'm having trouble thinking of one.

Repro attached.

Attachment: range_deleter_refine_missing_index_repro.js (3 kB, Mar 20 2020 07:46:22 PM UTC)

depends on: SERVER-69768 Include key pattern in range deletion task documents (Closed)
is related to: SERVER-52906 moveChunk after failed migration that rolled back cloning indexes can hang indefinitely due to missing shard key index (Closed)
related to: SERVER-79632 Stop range deletion when hashed shard key index does not exist (Closed)

Assignee: [DO NOT USE] Backlog - Sharding EMEA
Reporter: Matthew Saltz (Inactive)
Participants: [DO NOT USE] Backlog - Sharding EMEA, Blake Oler, Esha Maharishi, Jordi Serra Torrens, Matthew Saltz
Votes: 0
Watchers: 12

Created: Mar 20 2020 07:52:51 PM UTC
Updated: Oct 29 2023 10:10:29 PM UTC
Resolved: Jul 04 2023 10:45:03 AM UTC
Confidence Status Last Update: 26/Mar/20 5:57 PM
