[SERVER-31220] Resmoke can crash if an interrupt happens Created: 22/Sep/17  Updated: 22/Sep/17  Resolved: 22/Sep/17

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Ian Boros
Assignee: DO NOT USE - Backlog - Test Infrastructure Group (TIG)
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-30872 "List index out of range" error on te... Closed
Operating System: ALL
Participants:

 Description   

Short version:
If resmoke receives an interrupt while running a test, it can crash while trying to produce the summary of the test.

Long version:
If resmoke receives an interrupt while running a test, a report is added to the list self._reports in the handler (through record_test_end()):

https://github.com/mongodb/mongo/blob/master/buildscripts/resmokelib/testing/suite.py#L128-L143

Once the handler finishes and control returns to the "normal" program, another report is added to self._reports:

https://github.com/mongodb/mongo/blob/master/buildscripts/resmokelib/testing/executor.py#L90-L99

This means the length of suite._reports is greater than the length of suite._test_start_times, since suite.record_test_end() has been called once more than suite.record_test_start(). The suite._summarize_repeated() function assumes these lists are the same length, so it fails with a "list index out of range" error.
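
The mismatch can be reproduced in isolation. The following is a minimal sketch, not resmoke's actual code: ToySuite and its summarize() method are hypothetical stand-ins that model only the list bookkeeping described above.

```python
class ToySuite:
    """Hypothetical stand-in for resmoke's Suite; only the list bookkeeping is modeled."""

    def __init__(self):
        self._test_start_times = []
        self._reports = []

    def record_test_start(self, start_time):
        self._test_start_times.append(start_time)

    def record_test_end(self, report):
        self._reports.append(report)

    def summarize(self):
        # Assumes one start time per report, which is the assumption the
        # description attributes to _summarize_repeated().
        return [(self._test_start_times[i], report)
                for i, report in enumerate(self._reports)]


suite = ToySuite()
suite.record_test_start(0.0)
# First call: made from the interrupt handler.
suite.record_test_end("report from interrupt handler")
# Second call: made once control returns to the normal code path.
suite.record_test_end("report from normal control flow")

try:
    suite.summarize()
    crashed = False
except IndexError:
    # Two reports but only one start time: index 1 does not exist.
    crashed = True
```

With one start time and two reports, the lookup at index 1 raises IndexError, which matches the failure mode described in this ticket.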

I think to fix this we should:
1) Add asserts in various places in suite.py that the length of _test_start_times is the same as the lengths of _test_end_times and _reports (this assertion can't go just anywhere, since it doesn't always hold, but it usually does)
2) Don't call record_test_end if the test was interrupted.
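
One possible reading of these two suggestions is sketched below. This is not a patch against resmoke: GuardedSuite, its _interrupted flag, handle_interrupt(), and finish_execution() are hypothetical names chosen for illustration, and the sketch assumes the interrupt handler is the one that records the end of an interrupted test.

```python
class GuardedSuite:
    """Hypothetical sketch of the two proposed fixes; not resmoke's real Suite."""

    def __init__(self):
        self._test_start_times = []
        self._test_end_times = []
        self._reports = []
        self._interrupted = False  # assumed flag, set by the interrupt handler

    def record_test_start(self, start_time):
        self._test_start_times.append(start_time)

    def record_test_end(self, end_time, report):
        self._test_end_times.append(end_time)
        self._reports.append(report)

    def handle_interrupt(self, end_time, report):
        # The handler records the end of the interrupted test exactly once.
        self._interrupted = True
        self.record_test_end(end_time, report)

    def finish_execution(self, end_time, report):
        # Suggestion 2: skip the second record_test_end() after an interrupt.
        if not self._interrupted:
            self.record_test_end(end_time, report)

    def summarize(self):
        # Suggestion 1: assert the invariant at a point where it must hold.
        assert (len(self._test_start_times)
                == len(self._test_end_times)
                == len(self._reports))
        return list(zip(self._test_start_times,
                        self._test_end_times,
                        self._reports))


guarded = GuardedSuite()
guarded.record_test_start(0.0)
guarded.handle_interrupt(1.5, "interrupted")
guarded.finish_execution(2.0, "normal")  # skipped because of the interrupt
summary = guarded.summarize()
```

Placing the assertion only in the summary step reflects the caveat in item 1: between record_test_start() and record_test_end() the lists legitimately differ in length, so the check belongs where a test is known not to be in flight.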

Here's an example of this happening:
https://evergreen.mongodb.com/task/mongodb_mongo_master_enterprise_rhel_62_64_bit_jstestfuzz_sharded_continuous_stepdown_patch_9f8084f2c87cdbe5616cd5d6d8adb2a8272a504a_59c415362fbabe345b00002d_17_09_21_19_38_47

And the log:
https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_enterprise_rhel_62_64_bit_jstestfuzz_sharded_continuous_stepdown_patch_9f8084f2c87cdbe5616cd5d6d8adb2a8272a504a_59c415362fbabe345b00002d_17_09_21_19_38_47/0?type=T&text=true

(Ctrl-F for "summary = self._summarize_execution(iteration, bulleter_sb)" to find the line with the relevant stack trace.)



 Comments   
Comment by Ian Boros [ 22/Sep/17 ]

Closing since I realized this is a duplicate of:

https://jira.mongodb.org/browse/SERVER-30872
