  1. Core Server
  2. SERVER-56762

FailPoint::setMode() could be blocked for hours, causing tests to time out

Details

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Service Arch
    • ALL

    Description

      We have a use case that requires a fail point to suspend a thread for over an hour. However, the loop in FailPoint::setMode() blocks until the fail point block exits, so the call cannot return while the thread is still suspended.

      Repro: while running this test, the loop ran 42,095 iterations and the test eventually timed out. The fail point was configured to block the Hello command processing thread for 100 minutes.

      I think the purpose of this spin wait is to prevent test flakiness by eliminating subtle races. An unbounded wait is not necessary for that purpose. I propose limiting the wait to 1 minute, which should be long enough to cover all of the race cases.
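
      A minimal sketch of what a bounded wait could look like, for illustration only. ToyFailPoint, activeRefs, and waitForDrain are hypothetical names, not the actual FailPoint internals. The sketch assumes the fail point tracks how many threads are currently inside the fail point block with an atomic counter.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for a fail point; NOT MongoDB's FailPoint class.
// activeRefs is assumed to count threads currently inside the fail point block.
struct ToyFailPoint {
    std::atomic<int> activeRefs{0};
};

// Polls until no thread is inside the fail point block or maxWait elapses.
// Returns true if the block drained, false if the wait was cut short.
bool waitForDrain(const ToyFailPoint& fp,
                  std::chrono::milliseconds maxWait,
                  std::chrono::milliseconds pollInterval = std::chrono::milliseconds(50)) {
    assert(pollInterval.count() > 0);
    const auto deadline = std::chrono::steady_clock::now() + maxWait;
    while (fp.activeRefs.load(std::memory_order_acquire) > 0) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // give up instead of blocking indefinitely
        }
        std::this_thread::sleep_for(pollInterval);
    }
    return true;
}
```

      With the proposed limit, the caller would pass std::chrono::minutes(1) as maxWait. A false return means a thread is still inside the fail point block, and the caller has to decide whether that is acceptable for the test.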

      This is a blocker for submitting integration tests for a HELP ticket, so I will propose a fix.

          People

            backlog-server-servicearch Backlog - Service Architecture
            andrew.shuvalov@mongodb.com Andrew Shuvalov (Inactive)
            Votes: 0
            Watchers: 3
