ShardsvrReshardCollection Can Hang If Stepdown Occurs Shortly After Stepping Up

    • Sharding NYC
    • Fully Compatible
    • ALL
    • v7.1, v7.0, v6.0, v5.0
    • Sharding NYC 2023-09-18, Sharding NYC 2023-10-02

      The ShardsvrReshardCollection command does not mark its operation context to be interrupted on stepdown, as other commands commonly do. As a result, a call to getOrCreateInstance can hang inside _waitForRecoveryCompletion while it waits for the state to reach kRecovered. After a stepdown the state is set to kPaused rather than kRecovered, so the wait is not satisfied; the operation context must be interrupted at stepdown to avoid the hang.

      See the two linked comments on BF-29457 for more information and an example of this happening.
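
The hang described above can be illustrated with a small standalone sketch. This is not the server code: RecoveryTracker, waitForRecovery, stepDown, finishRecovery, and runStepDownScenario are hypothetical names invented for the illustration, and the `interruptible` flag only models an operation context that has been marked for interruption at stepdown. The bounded deadline stands in for an indefinite hang so that the example terminates.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the states named in the ticket.
enum class State { kRecovering, kRecovered, kPaused };
enum class WaitResult { kRecovered, kInterrupted, kTimedOut };

class RecoveryTracker {
public:
    // Waits for kRecovered. If `interruptible` is true, a stepdown also wakes
    // the waiter. The deadline only exists so this demo cannot block forever;
    // a timeout here represents the hang.
    WaitResult waitForRecovery(bool interruptible, std::chrono::milliseconds deadline) {
        std::unique_lock<std::mutex> lk(_mutex);
        bool done = _cv.wait_for(lk, deadline, [&] {
            return _state == State::kRecovered || (interruptible && _steppedDown);
        });
        if (!done) {
            return WaitResult::kTimedOut;
        }
        return _state == State::kRecovered ? WaitResult::kRecovered
                                           : WaitResult::kInterrupted;
    }

    // Recovery completed normally.
    void finishRecovery() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _state = State::kRecovered;
        }
        _cv.notify_all();
    }

    // Stepdown moves the state to kPaused, which never satisfies a waiter
    // that is only looking for kRecovered.
    void stepDown() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _state = State::kPaused;
            _steppedDown = true;
        }
        _cv.notify_all();
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    State _state = State::kRecovering;
    bool _steppedDown = false;
};

// Runs one waiter against a stepdown and reports how the wait ended.
WaitResult runStepDownScenario(bool interruptible) {
    RecoveryTracker tracker;
    WaitResult result = WaitResult::kTimedOut;
    std::thread waiter([&] {
        result = tracker.waitForRecovery(interruptible, std::chrono::milliseconds(200));
    });
    tracker.stepDown();
    waiter.join();
    return result;
}
```

In this sketch, runStepDownScenario(false) ends only because of the artificial deadline, which corresponds to the reported hang, while runStepDownScenario(true) returns promptly once the stepdown is observed.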

              Assignee:
              Nandini Bhartiya
              Reporter:
              Brett Nawrocki
              Votes:
              0
              Watchers:
              5
