  1. Core Server
  2. SERVER-49158

Make the moveChunk helper retry the command when the migration is aborted due to stepdown

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7.0
    • Component/s: Sharding
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Sprint:
      Sharding 2020-06-29
    • Linked BF Score:
      13

      Description

      In the concurrency stepdown suites, the donor shard's primary can step down right after sending _recvChunkStart to the recipient. If the donor shard's primary in the next term aborts that migration during migration recovery and immediately starts another migration, the recipient's _migrationClone for the aborted migration reaches the donor shard's latest primary and fails the session id validation. This causes the moveChunk to fail with OperationFailed and the error message "Requested migration session id ... does not match active session id ...", since a different migration is now active. Therefore, the moveChunk helper should also retry the command on an OperationFailed error with that error message.
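
The retry condition described above could be sketched roughly as follows. This is an illustration only, not the actual helper from the test suite: the names `is_retryable_session_mismatch`, `move_chunk_with_retry`, `run_move_chunk` and `max_attempts` are hypothetical, the command response is assumed to be a dict with `ok`, `code` and `errmsg` fields, and 96 is assumed to be the numeric code for OperationFailed.

```python
import re
from typing import Any, Callable, Mapping

# Assumption: numeric error code that the server reports for OperationFailed.
OPERATION_FAILED = 96

# The ids elided as "..." in the ticket are matched with a wildcard.
SESSION_MISMATCH = re.compile(
    r"Requested migration session id .* does not match active session id"
)


def is_retryable_session_mismatch(response: Mapping[str, Any]) -> bool:
    """Return True only for OperationFailed with the session id mismatch message."""
    if response.get("ok") == 1:
        return False
    return (
        response.get("code") == OPERATION_FAILED
        and SESSION_MISMATCH.search(response.get("errmsg", "")) is not None
    )


def move_chunk_with_retry(
    run_move_chunk: Callable[[], Mapping[str, Any]],
    max_attempts: int = 5,
) -> Mapping[str, Any]:
    """Call run_move_chunk until it stops failing with the session id mismatch.

    run_move_chunk is a hypothetical callable that issues moveChunk once and
    returns the raw command response. The last response is returned as-is,
    so the caller still sees any other failure.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response: Mapping[str, Any] = {}
    for _ in range(max_attempts):
        response = run_move_chunk()
        if not is_retryable_session_mismatch(response):
            break
    return response
```

Matching on both the error code and the message keeps the retry narrow: other OperationFailed errors are still surfaced to the caller on the first attempt.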

            People

            Assignee:
            cheahuychou.mao Cheahuychou Mao
            Reporter:
            cheahuychou.mao Cheahuychou Mao
