Resharding recipient service can have an infinite retry loop for retryable error


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • ALL
    • 200

      BF-40304 is happening in resharding_recipient_service.cpp because, after a transient error, resharding hits an until() that exits only once the status is OK. The onError that would check for the abort runs after the retry loop, and the .on() is only checked after the loop body finishes. As a result, we never actually check inside the retry loop whether we should stop because of an abort or a step down. We should add a custom predicate so that, when we retry on a retryable error, the resharding recipient service can exit gracefully.
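      The proposed fix can be illustrated with a minimal, standalone sketch. It does not use the real server code: Status, Code, CancelFlag, retryUntil, stopOnlyWhenOk and stopWhenOkOrCanceled are all hypothetical stand-ins invented for this example, and maxAttempts exists only so the demonstration cannot spin forever. The sketch contrasts a predicate that stops only on success with one that also stops once an abort or step down has been signaled.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical, simplified stand-ins; these are NOT the real server types.
enum class Code { OK, Retryable };

struct Status {
    Code code;
    bool isOK() const { return code == Code::OK; }
};

// Stand-in for an abort / step-down signal observed by the retry loop.
struct CancelFlag {
    std::atomic<bool> canceled{false};
    void cancel() { canceled.store(true); }
    bool isCanceled() const { return canceled.load(); }
};

// Runs `body` until `shouldStop(status)` returns true or `maxAttempts` is
// reached, and returns the last status. `maxAttempts` is an assumption added
// so that this sketch always terminates.
Status retryUntil(const std::function<Status()>& body,
                  const std::function<bool(const Status&)>& shouldStop,
                  int maxAttempts) {
    assert(maxAttempts > 0);
    Status last{Code::Retryable};
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        last = body();
        if (shouldStop(last)) {
            break;
        }
    }
    return last;
}

// Predicate matching the behavior described in the ticket: it stops only on
// success, so a persistent retryable error keeps the loop running.
bool stopOnlyWhenOk(const Status& s) {
    return s.isOK();
}

// Sketch of the proposed custom predicate: it also stops once cancellation
// (abort or step down) has been signaled, even if the last status is an error.
std::function<bool(const Status&)> stopWhenOkOrCanceled(const CancelFlag& flag) {
    return [&flag](const Status& s) { return s.isOK() || flag.isCanceled(); };
}
```

      With stopWhenOkOrCanceled, a body that keeps returning a retryable error stops being retried on the first iteration after the flag is set, whereas stopOnlyWhenOk keeps retrying until the attempt cap is reached.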

            Assignee:
            David Chen
            Reporter:
            David Chen
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: