  1. Core Server
  2. SERVER-31220

Resmoke can crash if an interrupt happens

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Testing Infrastructure

      Short version:
      If resmoke receives an interrupt while running a test, it can crash while trying to produce the summary of the test.

      Long version:
      If resmoke receives an interrupt while running a test, a report is added to the list self._reports in the handler (through record_test_end()):

      https://github.com/mongodb/mongo/blob/master/buildscripts/resmokelib/testing/suite.py#L128-L143

      Once the handler finishes, and control is passed back to the "normal" program, another report is added to self._reports:

      https://github.com/mongodb/mongo/blob/master/buildscripts/resmokelib/testing/executor.py#L90-L99

      This means suite._reports is longer than suite._test_start_times, since suite.record_test_end() has been called once more than suite.record_test_start(). The suite._summarize_repeated() function assumes these lists are the same length, and so fails with a list index out of range error.
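To make the mismatch concrete, here is a minimal sketch of the failure mode. ToySuite is a hypothetical stand-in, not the real resmoke Suite class; only the names record_test_start, record_test_end, _reports, _test_start_times and _test_end_times come from the description above, and the summary loop is an assumption about how the lists are indexed.

```python
import time


class ToySuite:
    """Hypothetical stand-in for resmoke's Suite; not the real class."""

    def __init__(self):
        self._test_start_times = []
        self._test_end_times = []
        self._reports = []

    def record_test_start(self):
        self._test_start_times.append(time.time())

    def record_test_end(self, report):
        self._test_end_times.append(time.time())
        self._reports.append(report)

    def summarize_repeated(self):
        # Assumed behavior: index all three lists in lockstep,
        # which only works if they have the same length.
        lines = []
        for i, report in enumerate(self._reports):
            elapsed = self._test_end_times[i] - self._test_start_times[i]
            lines.append("execution %d: %s (%.2fs)" % (i, report, elapsed))
        return lines


suite = ToySuite()
suite.record_test_start()
suite.record_test_end("interrupted")  # called from the interrupt handler
suite.record_test_end("interrupted")  # called again by the normal code path

try:
    suite.summarize_repeated()
    crashed = False
except IndexError:
    # The second report has no matching start time.
    crashed = True
```

With one start and two ends recorded, the second loop iteration looks up a start time that was never appended, which is the shape of the crash described above.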

      I think to fix this we should:
      1) Add asserts in various places in suite.py that the length of _test_start_times is the same as that of _test_end_times and _reports (this isn't an assertion we can put just anywhere, since it's not always true, though it usually is)
      2) Don't call record_test_end if the test was interrupted.
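The two suggestions above could be sketched roughly as follows. This is one possible reading, not the actual fix: ToySuite, _check_lengths, handle_interrupt, the _interrupted flag and run_once are all hypothetical names, and the sketch assumes the handler's record_test_end() call is kept while the normal code path skips its own call after an interrupt.

```python
import time


class ToySuite:
    """Hypothetical stand-in for resmoke's Suite; not the real class."""

    def __init__(self):
        self._test_start_times = []
        self._test_end_times = []
        self._reports = []
        self._interrupted = False  # hypothetical flag set by the handler

    def _check_lengths(self):
        # Suggestion 1: the lists should agree whenever no test is in flight.
        assert (len(self._test_start_times)
                == len(self._test_end_times)
                == len(self._reports))

    def record_test_start(self):
        self._check_lengths()  # holds between executions
        self._test_start_times.append(time.time())

    def record_test_end(self, report):
        self._test_end_times.append(time.time())
        self._reports.append(report)
        self._check_lengths()  # holds again once the end is recorded

    def handle_interrupt(self, report):
        # The handler records the end of the interrupted test once.
        self.record_test_end(report)
        self._interrupted = True


def run_once(suite, interrupt_during_test):
    """Hypothetical driver for a single execution of the suite."""
    suite.record_test_start()
    if interrupt_during_test:
        suite.handle_interrupt("interrupted")
    # Suggestion 2: skip the second record_test_end() after an interrupt.
    if not suite._interrupted:
        suite.record_test_end("finished")


interrupted_suite = ToySuite()
run_once(interrupted_suite, interrupt_during_test=True)

normal_suite = ToySuite()
run_once(normal_suite, interrupt_during_test=False)
```

In both runs the three lists end up with one entry each, and a duplicate record_test_end() call would trip the assertion immediately instead of surfacing later in the summary code.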

      Here's an example of this happening:
      https://evergreen.mongodb.com/task/mongodb_mongo_master_enterprise_rhel_62_64_bit_jstestfuzz_sharded_continuous_stepdown_patch_9f8084f2c87cdbe5616cd5d6d8adb2a8272a504a_59c415362fbabe345b00002d_17_09_21_19_38_47

      And the log:
      https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_enterprise_rhel_62_64_bit_jstestfuzz_sharded_continuous_stepdown_patch_9f8084f2c87cdbe5616cd5d6d8adb2a8272a504a_59c415362fbabe345b00002d_17_09_21_19_38_47/0?type=T&text=true

      (Search the log for "summary = self._summarize_execution(iteration, bulleter_sb)" to find the line with the relevant stack trace.)

            Assignee:
            backlog-server-tig DO NOT USE - Backlog - Test Infrastructure Group (TIG)
            Reporter:
            ian.boros@mongodb.com Ian Boros