- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- None
- Cluster Scalability
- (copied to CRM)
- 3
The migration destination manager on the recipient starts fetching session information at the beginning of the move chunk process. This fetch runs on a separate thread. If SessionCatalogMigrationDestination fails for any reason (e.g. Operation Interrupted), we record the failure but do not abort the chunk migration.
MigrationDestinationManager does eventually check the session migration status and fails if that status is ErrorOccurred, but the check happens only at the very end of the chunk migration. As a result, the chunk migration does not fail immediately when session migration has failed.
This can leave a chunk migration stuck for 6 hours (the timeout). One of the conditions for the donor to engage the critical section is that session migration has succeeded, so the donor keeps waiting for the recipient to finish session migration while the recipient waits for the donor to engage the critical section. The donor keeps retrying until it times out after 6 hours.
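The mutual wait described above can be illustrated with a toy model. This is a minimal sketch in Python, not MongoDB code: `run_recipient`, `check_early`, and the list of polled states are hypothetical names invented for illustration, and only the `ErrorOccurred` status comes from the description. It contrasts the current behaviour (the failure is not acted on while waiting) with checking the session migration status on every poll, which is one possible way to fail sooner.

```python
from enum import Enum


class SessionMigrationState(Enum):
    MIGRATING = "Migrating"
    DONE = "Done"
    ERROR_OCCURRED = "ErrorOccurred"


def run_recipient(states, check_early):
    """Toy model of the recipient waiting for the donor's critical section.

    `states` is the session migration state observed at each poll; running
    out of polls stands in for the 6-hour timeout.
    Returns (outcome, polls_used).
    """
    for poll, state in enumerate(states, start=1):
        # Hypothetical fail-fast check: abort as soon as the failure is seen.
        if check_early and state is SessionMigrationState.ERROR_OCCURRED:
            return ("aborted", poll)
        # The donor engages the critical section only once session
        # migration has succeeded.
        if state is SessionMigrationState.DONE:
            return ("committed", poll)
    # Nothing broke the wait, so the loop runs until the timeout.
    return ("timed_out", len(states))


# Session migration fails on the second poll and never recovers.
failed = [SessionMigrationState.MIGRATING] + [SessionMigrationState.ERROR_OCCURRED] * 9

late_check = run_recipient(failed, check_early=False)
early_check = run_recipient(failed, check_early=True)
print(late_check, early_check)
```

With the early check disabled, the model waits through every poll because neither side can make progress. With it enabled, the wait ends on the poll where the failure first becomes visible.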
- is caused by: SERVER-56185 Investigate possible improvements with session migration and a chunk migration's critical section (Closed)