Type: Task
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 5.3.0
Affects Version/s: None
Component/s: Sharding
Labels: None

Backwards Compatibility: Fully Compatible
Sprint: Sharding EMEA 2022-01-10, Sharding EMEA 2022-01-24
Confidence Status: None
Work Order: 3
CAR Domain/s: None

Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Currently, when certain errors occur during a migration (or migration recovery), the donor shard clears its filtering metadata so that the migration is recovered the next time a query attempts to use that collection. Some code paths trigger a best-effort recovery, while others do not, and even a best-effort attempt can fail. This behavior is correct, but with the new migration protocol (where the recipient takes the critical section) it can leave the recipient, for long periods, holding both the critical section (making the collection unavailable) and the ActiveMigrationRegistry (preventing the recipient shard from donating or receiving chunks of any other collection).

This ticket is to evaluate ensuring that the migration recovery is retried until it succeeds.
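
For illustration only, the "retry until success" idea can be sketched as a loop with capped exponential backoff. This is not the server's implementation: `retry_until_success`, `FlakyRecovery`, the delay values, and the policy of treating every exception as retryable are all hypothetical names and assumptions made for this sketch.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_until_success(
    operation: Callable[[], T],
    initial_delay: float = 0.1,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it returns without raising, backing off between attempts."""
    delay = initial_delay
    while True:
        try:
            return operation()
        except Exception:
            # Assumption for this sketch: every failure is treated as retryable.
            sleep(delay)
            # Double the wait after each failure, but never exceed max_delay.
            delay = min(delay * 2, max_delay)


class FlakyRecovery:
    """Hypothetical stand-in for a recovery step that fails a fixed number of times."""

    def __init__(self, failures_before_success: int) -> None:
        self.remaining_failures = failures_before_success
        self.attempts = 0

    def __call__(self) -> str:
        self.attempts += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise RuntimeError("transient recovery failure")
        return "recovered"
```

Compared with a single best-effort attempt, a loop of this shape only returns once the recovery has completed, so the critical section and the registry entry would be released by the recovery itself rather than waiting for a later query to trigger it. A real implementation would also need to distinguish retryable errors from terminal conditions such as shutdown or step-down.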

Assignee: Antonio Fuschetto
Reporter: Jordi Serra Torrens
Participants: Antonio Fuschetto, Githook User, Jordi Serra Torrens
Votes: 0
Watchers: 3

Created: Dec 28 2021 03:35:44 PM UTC
Updated: Oct 29 2023 09:44:40 PM UTC
Resolved: Jan 18 2022 09:42:45 AM UTC
Confidence Status Last Update: 04/Jan/22 12:05 PM
