[SERVER-49158] Make the moveChunk helper retry the command when the migration is aborted due to stepdown
Created: 28/Jun/20  Updated: 29/Oct/23  Resolved: 30/Jun/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.7.0

Type: Bug
Priority: Major - P3
Reporter: Cheahuychou Mao
Assignee: Cheahuychou Mao
Resolution: Fixed
Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2020-06-29
Participants:
Linked BF Score: 13

 Description   

In the concurrency stepdown suites, the donor shard's primary can step down right after sending _recvChunkStart to the recipient. The donor's primary in the next term then aborts the migration during migration recovery and immediately starts another migration. When the recipient sends _migrationClone for the aborted migration to the donor shard's latest primary, the command fails session id validation because a different migration is now active. This causes moveChunk to fail with OperationFailed and the error message "Requested migration session id ... does not match active session id ...". Therefore, the moveChunk helper also needs to retry the command on an OperationFailed error with that error message.
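
The retry condition described above could be sketched roughly as follows. This is an illustration only, not the code from the linked commit: the names isRetryableSessionMismatch and moveChunkWithRetry, the runCommand callback, and the maxAttempts bound are all hypothetical, and the response is assumed to carry the usual ok, codeName, and errmsg fields.

```javascript
// Hypothetical sketch; not the actual helper from the MongoDB repository.
// The pattern mirrors the error message quoted in the description.
const SESSION_MISMATCH_RE =
    /Requested migration session id .* does not match active session id/;

// Returns true when a failed moveChunk response looks like the
// stale-session failure described above and is worth retrying.
function isRetryableSessionMismatch(res) {
    return res.ok === 0 &&
        res.codeName === "OperationFailed" &&
        SESSION_MISMATCH_RE.test(res.errmsg || "");
}

// runCommand: caller-supplied function that issues moveChunk and returns
// the raw response object (an assumption made for this sketch).
// maxAttempts: assumed retry bound; the ticket does not specify one.
function moveChunkWithRetry(runCommand, maxAttempts = 5) {
    let res;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        res = runCommand();
        if (!isRetryableSessionMismatch(res)) {
            return res;  // success, or a failure that should not be retried
        }
    }
    return res;  // still failing after maxAttempts; surface the last response
}
```

Matching on both the error code and the message text keeps the retry narrow: other OperationFailed errors are still returned to the caller on the first attempt.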



 Comments   
Comment by Githook User [ 30/Jun/20 ]

Author: Cheahuychou Mao <cheahuychou.mao@mongodb.com> (cheahuychou)

Message: SERVER-49158 Make the moveChunk helper retry the command when the migration is aborted due to stepdown
Branch: master
https://github.com/mongodb/mongo/commit/0f1568965ee827870b845f26019e91c7d923879a
