Core Server / SERVER-47025

moveChunk after refine shard key can hang indefinitely due to missing shard key index

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.4 Required, 5.0 Required
    • Component/s: Sharding
    • Labels:
      None
    • Operating System:
      ALL
    • Backport Requested:
      v4.4
    • Sprint:
      Sharding 2020-04-06, Sharding 2020-04-20, Sharding 2020-05-04, Sharding 2020-05-18, Sharding 2020-07-13, Sharding 2020-06-01, Sharding 2020-06-15, Sharding 2020-06-29, Sharding 2020-07-27, Sharding 2020-08-24
    • Linked BF Score:
      17

      Description

      When the resumable range deleter is disabled, the recipient of a chunk starts by removing potentially orphaned documents. After that, it clones necessary indexes from the donor.

      However, the range deleter relies on the shard key index to perform deletions, so on the recipient it can run before the index it needs has been cloned.

      This can lead to the following scenario:
      1. A moveChunk begins
      2. The shard key is refined
      3. The moveChunk fails on the recipient for some reason, causing the entire moveChunk to fail
      4. The moveChunk is restarted, now with a refined shard key
      5. The recipient of the moveChunk attempts to delete the incoming range using the range deleter with the refined shard key
      6. The range deleter loops indefinitely because it cannot find a shard key index for the refined key, so the moveChunk hangs.
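
      The ordering problem in the steps above can be illustrated with a small toy model. This is only a sketch, not server code: Shard, range_deleter, receive_chunk and demo are hypothetical names, an index is matched by exact key pattern, the recipient is assumed to already hold an index on the original shard key, and the infinite loop is capped at a few attempts so the example terminates.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Toy shard: only tracks which index key patterns exist locally."""
    indexes: set = field(default_factory=set)


def range_deleter(shard, shard_key, max_attempts=5):
    """Return True once an index matching shard_key is found.

    The deleter in the report loops forever; this model gives up after
    max_attempts and returns False so the example terminates.
    """
    for _ in range(max_attempts):
        if shard_key in shard.indexes:
            return True
    return False


def receive_chunk(recipient, donor, shard_key):
    """Recipient side of a migration, in the order the description gives."""
    # 1. Remove potentially orphaned documents (needs a shard key index).
    if not range_deleter(recipient, shard_key):
        return "stuck"  # stand-in for the indefinite hang
    # 2. Only afterwards clone the donor's indexes.
    recipient.indexes |= donor.indexes
    return "ok"


def demo():
    original, refined = ("a",), ("a", "b")
    donor = Shard({original})
    recipient = Shard({original})  # assumption: index on the original key exists
    first = receive_chunk(recipient, donor, original)  # contrast case
    donor.indexes.add(refined)  # shard key refined; only the donor has the new index
    retry = receive_chunk(recipient, donor, refined)   # restarted moveChunk
    return first, retry
```

      In this model, demo() returns ("ok", "stuck"): the migration under the original key gets past the deletion step, while the restarted one never reaches index cloning because the deletion step is waiting on an index that only cloning would provide.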

      There may be less convoluted scenarios that cause this as well, but I have not been able to think of one.

      Repro attached.

            People

            Assignee:
            esha.maharishi Esha Maharishi
            Reporter:
            matthew.saltz Matthew Saltz
            Votes:
            0
            Watchers:
            7
