remove shard failed: move chunk abort


    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 4.0.10
    • Component/s: Sharding
    • ALL
    • 3

      The second `removeShard` call failed: the chunk move was aborted.

      There are 4 shards in my sharded cluster: shard1, shard2, shard3, and shard4. Several sharded collections are distributed across these shards.

      First, I removed shard2 successfully. Then I called `removeShard` to remove shard4, but it never completed: `sh.status()` always reports shard4 in the "draining" state.
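
To see whether draining is making progress, `removeShard` can be issued again for the same shard; its reply includes a `state` field and a `remaining` count. The sketch below is only an illustration: `checkDraining` and `runAdminCommand` are hypothetical names, and the wrapper is assumed to forward the command to a mongos (for example `(cmd) => db.adminCommand(cmd)` in the mongo shell).

```javascript
// Sketch: report draining progress by re-issuing removeShard.
// `runAdminCommand` is a hypothetical wrapper around db.adminCommand.
function checkDraining(runAdminCommand, shardName) {
  const res = runAdminCommand({ removeShard: shardName });
  if (!res || res.ok !== 1) {
    // The command itself failed; surface the server's error message if any.
    return { shard: shardName, state: "error", detail: res && res.errmsg };
  }
  // While draining, the reply carries the chunks and databases still to move.
  const remaining = res.remaining || {};
  return {
    shard: shardName,
    state: res.state, // "started", "ongoing", or "completed"
    chunksLeft: remaining.chunks,
    dbsLeft: remaining.dbs,
  };
}
```

If `chunksLeft` stays the same across repeated calls, as described above for shard4, the balancer is likely failing to move chunks off the shard, and the shard's log is the next place to look.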

      After going through the shard4 log, I found the error: "Chunk move failed :: caused by :: ShardNotFound: Shard2 not found". So I think the cached routing information was not updated after shard2 was removed.
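
One way to check this suspicion is to compare the shard named in the error with the shards the cluster currently knows about. The following is a minimal sketch, not an official diagnostic: `findUnknownShard` is a hypothetical helper, and `listShardsReply` is assumed to be the reply of the `listShards` admin command run against a mongos. The comparison ignores case only because the error text above prints "Shard2" while the shard is referred to as shard2.

```javascript
// Sketch: extract the shard named in a ShardNotFound error and check it
// against a listShards reply. Helper name and inputs are illustrative.
function findUnknownShard(errorLine, listShardsReply) {
  const match = /ShardNotFound: (\S+) not found/.exec(errorLine);
  if (!match) {
    return null; // not a ShardNotFound error
  }
  const named = match[1];
  // Shard ids currently registered in the cluster, lowercased for comparison.
  const knownIds = (listShardsReply.shards || []).map((s) => String(s._id).toLowerCase());
  return { shard: named, known: knownIds.includes(named.toLowerCase()) };
}
```

A result with `known: false` means the failing node still refers to a shard that is no longer registered, which would be consistent with stale routing information.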

      The error can be reproduced by running the `_recvChunkStart` command on shard4; see the attached screenshots.

        1. mongod.log.2020-04-30T00-15-05.tar.gz
          7.37 MB
          vinllen chen
        2. screenshot-1.png
          2.31 MB
          vinllen chen
        3. _recvChunkStart_failed.png
          762 kB
          vinllen chen
        4. move_chunk_fail.png
          924 kB
          vinllen chen
        5. sh.status.png
          314 kB
          vinllen chen

            Assignee:
            Carl Champain (Inactive)
            Reporter:
            vinllen chen
            Votes:
            0
            Watchers:
            3

              Created:
              Updated:
              Resolved: