FailPoint::setMode() can block for hours, causing tests to time out


    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Service Arch
    • ALL

      We have a use case that requires a fail point to suspend a thread for over an hour. However, the loop in FailPoint::setMode() blocks until the fail point block has exited, so setMode() stalls for the same duration.

      Repro: while running this test, the loop ran 42,095 iterations and the test eventually timed out. The fail point was configured to block the Hello command processing thread for 100 minutes.

      I think the purpose of this spin wait is to prevent test flakiness by eliminating subtle races. An unbounded wait is not necessary for that purpose. I propose limiting the wait to 1 minute, which is long enough to cover all race cases.
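
      A minimal sketch of the proposed bounded wait, to make the idea concrete. This is not the actual FailPoint implementation: the class BoundedWaitFailPoint, its enter()/exit() counters, and waitForDrain() are hypothetical names invented for illustration, and the 1 ms polling interval is an assumption. It only shows a spin wait that gives up at a deadline instead of waiting indefinitely.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for a fail point. It only tracks how many threads
// are currently inside the fail point block.
class BoundedWaitFailPoint {
public:
    // Called by a thread when it enters the fail point block.
    void enter() { _activeThreads.fetch_add(1); }

    // Called by a thread when it leaves the fail point block.
    void exit() {
        [[maybe_unused]] const int previous = _activeThreads.fetch_sub(1);
        assert(previous > 0);  // exit() must be paired with an earlier enter()
    }

    // Waits until no thread is inside the block, but never longer than
    // maxWait. Returns true if the block drained, false on timeout.
    bool waitForDrain(std::chrono::milliseconds maxWait) {
        const auto deadline = std::chrono::steady_clock::now() + maxWait;
        while (_activeThreads.load() > 0) {
            if (std::chrono::steady_clock::now() >= deadline) {
                return false;  // give up instead of blocking the caller
            }
            // Assumed polling interval; avoids a hot spin.
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        return true;
    }

private:
    std::atomic<int> _activeThreads{0};
};
```

      With a limit of std::chrono::minutes(1) passed as maxWait, a caller in the position of setMode() would return after at most about one minute even if a thread stays suspended in the block for 100 minutes. What the caller should do on timeout is left open here.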

      This is a blocker for submitting integration tests for a HELP ticket, so I will propose a fix.

            Assignee:
            [DO NOT USE] Backlog - Service Architecture
            Reporter:
            Andrew Shuvalov (Inactive)