[SERVER-65969] Migration completion must not be signaled before releasing the ActiveMigrationRegistry Created: 26/Apr/22 Updated: 29/Oct/23 Resolved: 17/May/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 6.0.0-rc8, 6.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Pierlauro Sciarelli | Assignee: | Paolo Polato |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Operating System: | ALL | ||||
| Backport Requested: | v6.0 | ||||
| Sprint: | Sharding EMEA 2022-05-16, Sharding EMEA 2022-05-30 | ||||
| Participants: | |||||
| Description |
|
This is likely a very rare race condition, but it is worth recording because it required a lot of investigation into a failing test in a patch. It can happen if the CSRS steps down during any test that issues two consecutive moveChunk commands on different ranges (e.g. here). When a _shardsvrMoveRange command (moveChunk in previous versions) joins an ongoing migration, it waits for the completion of the original migration, and that completion is signaled before the ActiveMigrationRegistry is released. As a result, the following flow could be reproduced:
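The ordering problem described above can be sketched in isolation. This is a minimal illustration in Python. It is not the MongoDB implementation, and it is not the flow referred to in the ticket. The class ActiveMigrationRegistrySketch, the functions run_migration and simulate, and the range names are all hypothetical. It also assumes that a waiter reacts to the completion signal by immediately trying to register a migration on a different range, and that the registry allows only one active migration at a time.

```python
from concurrent.futures import Future


class ActiveMigrationRegistrySketch:
    """Hypothetical stand-in: allows at most one active migration at a time."""

    def __init__(self):
        self.active_range = None

    def register(self, chunk_range):
        if self.active_range is not None:
            raise RuntimeError(
                f"conflicting migration in progress on {self.active_range}"
            )
        self.active_range = chunk_range

    def release(self):
        self.active_range = None


def run_migration(registry, chunk_range, completion, signal_before_release):
    """Run a (no-op) migration, then signal and release in the chosen order."""
    registry.register(chunk_range)
    # ... the actual migration work would happen here ...
    if signal_before_release:
        # Problematic order: waiters are notified while the registry is held.
        completion.set_result("ok")
        registry.release()
    else:
        # Order required by the ticket title: release first, then signal.
        registry.release()
        completion.set_result("ok")


def simulate(signal_before_release):
    """Return what happens to a second migration started on completion."""
    registry = ActiveMigrationRegistrySketch()
    completion = Future()
    outcome = {}

    def start_next_migration(_done):
        # Assumption: the waiter immediately issues the next migration.
        try:
            registry.register("range B")
            outcome["second"] = "started"
            registry.release()
        except RuntimeError as exc:
            outcome["second"] = f"failed: {exc}"

    # Done-callbacks run synchronously inside set_result(), which keeps
    # this sketch deterministic without real threads.
    completion.add_done_callback(start_next_migration)
    run_migration(registry, "range A", completion, signal_before_release)
    return outcome["second"]


if __name__ == "__main__":
    print("signal before release:", simulate(True))
    print("release before signal:", simulate(False))
```

In this sketch, signaling first lets the waiter observe a registry that is still occupied by the finished migration, while releasing first lets the next migration register cleanly.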
|
| Comments |
| Comment by Githook User [ 29/May/22 ] |
|
Author: Paolo Polato <paolo.polato@mongodb.com> (ppolato) Message: (cherry picked from commit bfee7c7eaef29fef0a1ec443d0527e335c18d756) |
| Comment by Githook User [ 17/May/22 ] |
|
Author: Paolo Polato <paolo.polato@mongodb.com> (ppolato) Message: |