[SERVER-49158] Make the moveChunk helper retry the command when the migration is aborted due to stepdown Created: 28/Jun/20 Updated: 29/Oct/23 Resolved: 30/Jun/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Cheahuychou Mao | Assignee: | Cheahuychou Mao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | sharding-wfbf-day | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Operating System: | ALL | ||||
| Sprint: | Sharding 2020-06-29 | ||||
| Participants: | |||||
| Linked BF Score: | 13 | ||||
| Description |
|
In the concurrency stepdown suites, the donor shard's primary can step down right after sending _recvChunkStart to the recipient. The primary in the next term then aborts the migration during migration recovery and immediately starts another migration. When the recipient sends _migrationClone for the aborted migration to the donor shard's latest primary, the command fails session id validation. This causes the moveChunk to fail with OperationFailed and the error message "Requested migration session id ... does not match active session id ..." because another migration is active. Therefore, the moveChunk helper needs to also retry the command on an OperationFailed error with that error message. |
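The retry condition described above could be sketched roughly as follows. This is an illustrative sketch only, not the code from the actual fix: the names `isSessionIdMismatch`, `moveChunkWithRetry`, `runMoveChunk`, and `maxAttempts` are hypothetical, and the sketch assumes the command response exposes `ok`, `code`, and `errmsg` fields and that OperationFailed has numeric code 96.

```javascript
// Hypothetical sketch; not the helper from the MongoDB test suite.
const OPERATION_FAILED = 96;  // assumed numeric value of OperationFailed

// True when the response is the stale-session failure from the description:
// OperationFailed whose message reports a migration session id mismatch.
function isSessionIdMismatch(res) {
    return res.code === OPERATION_FAILED &&
        /does not match active session id/.test(res.errmsg || "");
}

// Calls runMoveChunk (a caller-supplied function returning a command
// response) until it succeeds, fails with a different error, or the
// attempt limit is reached. Returns the last response received.
function moveChunkWithRetry(runMoveChunk, maxAttempts = 5) {
    let res;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        res = runMoveChunk();
        if (res.ok === 1 || !isSessionIdMismatch(res)) {
            return res;
        }
    }
    return res;
}
```

Matching on both the error code and the message keeps the retry narrow: other OperationFailed errors are still returned to the caller unchanged.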
| Comments |
| Comment by Githook User [ 30/Jun/20 ] |
|
Author: {'name': 'Cheahuychou Mao', 'email': 'cheahuychou.mao@mongodb.com', 'username': 'cheahuychou'} Message: |