Resharding services need to handle failures after state transitions


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • ALL

      Many resharding functions follow this pattern:

      ExecutorFuture<void> someWork() {
        // Skip everything if a previous attempt already reached FooState.
        if (_state >= FooState) {
          return ExecutorFuture<void>(**executor, Status::OK());
        }
        ...
        transitionState(FooState);
      }

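      On retry, someWork returns early without re-running anything if the state transition has already happened. Any work that a function performs after transitionState(FooState) is therefore skipped on retry if the failure occurred after the transition.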

      We have identified at least one case where a function does more work after transitioning the state. Every such case needs one of two fixes: directly retry the work done after the state transition, or refactor the function so that no work is done after the state transition.
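The failure mode and the first of the two fixes can be illustrated with a minimal, standalone sketch. It does not use the real resharding classes: `Service`, `State`, `doPreTransitionWork`, `doPostTransitionWork`, `someWorkBuggy`, `someWorkRetryable`, and `retry` are all hypothetical names introduced for illustration, and a thrown exception stands in for whatever failure triggers the retry. The sketch also assumes the post-transition work is idempotent, so re-running it is safe.

```cpp
#include <stdexcept>

// Hypothetical stand-ins; none of these names come from the resharding code.
enum class State { kInitial, kFoo };

struct Service {
    State state = State::kInitial;
    int preWorkRuns = 0;
    int postWorkRuns = 0;
    bool failPostWorkOnce = true;  // simulate one failure after the transition

    void transitionState(State next) { state = next; }

    // Work that happens before the state transition.
    void doPreTransitionWork() { ++preWorkRuns; }

    // Work that happens after the state transition. Assumed idempotent.
    void doPostTransitionWork() {
        if (failPostWorkOnce) {
            failPostWorkOnce = false;
            throw std::runtime_error("failure after state transition");
        }
        ++postWorkRuns;
    }

    // Problematic shape: the early return also skips the post-transition
    // work when a retry arrives after the state has already advanced.
    void someWorkBuggy() {
        if (state >= State::kFoo) {
            return;
        }
        doPreTransitionWork();
        transitionState(State::kFoo);
        doPostTransitionWork();
    }

    // One possible fix: guard only the pre-transition work, so the
    // post-transition work is attempted again on every retry.
    void someWorkRetryable() {
        if (state < State::kFoo) {
            doPreTransitionWork();
            transitionState(State::kFoo);
        }
        doPostTransitionWork();
    }
};

// Calls fn until it stops throwing, at most maxAttempts times.
// Returns true if some attempt completed without throwing.
template <typename F>
bool retry(F fn, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        try {
            fn();
            return true;
        } catch (const std::runtime_error&) {
            // Swallow the simulated failure and try again.
        }
    }
    return false;
}
```

With `someWorkBuggy`, the retry reports success even though the post-transition work never completed. With `someWorkRetryable`, the pre-transition work runs once and the post-transition work completes on the second attempt. The second fix named above, moving the post-transition work into its own function with its own state guard, avoids relying on idempotence at the cost of an extra state.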

            Assignee:
            Unassigned
            Reporter:
            Ben Gawel
            Votes:
            0
            Watchers:
            4
