Core Server / SERVER-31922

Make the migration chunk cloner source resilient to stepdowns and network errors

    • Type: Improvement
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels:
      None
    • Sharding
    • 20

      On the donor shard, MigrationChunkClonerSourceLegacy::startClone uses the private member function _callRecipient to call the recipient, and _callRecipient in turn uses the task executor to make the call. The task executor does not retry NotMaster errors, so a stepdown on the recipient fails the call.

      One solution would be to use a ShardRemote instead and allow NotMaster errors to be retried for the first command, _recvChunkStart. We don't want to retry all commands this way, but the first one is safe, I think.
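
      The retry rule proposed above could look roughly like the following. This is a minimal, self-contained sketch and not the server's actual code: Code, Result, SendFn, isRetriableCommand, callRecipientWithRetry and the default of three attempts are all hypothetical names and values chosen for illustration, and it does not use the real ShardRemote or task executor interfaces. It only shows the idea of retrying NotMaster for _recvChunkStart while passing every other command's error straight through.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical error categories; the real server has its own error code enum.
enum class Code { OK, NotMaster, NetworkError, Other };

struct Result {
    Code code;
    std::string reason;
};

// Hypothetical stand-in for "send one command to the recipient shard".
using SendFn = std::function<Result(const std::string& cmdName)>;

// Only the first migration command is assumed to be safe to re-send.
inline bool isRetriableCommand(const std::string& cmdName) {
    return cmdName == "_recvChunkStart";
}

// Sends cmdName once, then re-sends while the result is NotMaster, the command
// is on the allow-list above, and fewer than maxAttempts sends have been made.
inline Result callRecipientWithRetry(const SendFn& send,
                                     const std::string& cmdName,
                                     int maxAttempts = 3) {
    Result last = send(cmdName);
    for (int attempt = 1;
         attempt < maxAttempts && last.code == Code::NotMaster &&
         isRetriableCommand(cmdName);
         ++attempt) {
        // A real implementation would re-target the new primary before re-sending.
        last = send(cmdName);
    }
    return last;
}
```

      With this shape, a recipient stepdown during _recvChunkStart is absorbed by the retry loop, while a NotMaster error on any later command is still returned to the caller after a single attempt.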

            Assignee:
            backlog-server-sharding [DO NOT USE] Backlog - Sharding Team
            Reporter:
            dianna.hohensee@mongodb.com Dianna Hohensee (Inactive)
            Votes:
            0
            Watchers:
            4

              Created:
              Updated:
              Resolved: