-
Type: Improvement
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Testing Infrastructure
-
Fully Compatible
-
STM 2020-06-01, STM 2020-06-15, STM 2020-06-29, STM 2020-07-27, STM 2020-08-10, STM 2020-08-24
-
7
NOTE: This should be behind a flag, which SERVER-46691 will remove.
When a test or task times out in Evergreen, resmoke is sent a SIGUSR1 signal. The signal handler in resmoke will be modified to call the hang analyzer on all tests that are still running. A test timeout should leave only one test running, but a task timeout can leave multiple jobs running.
Determining the test PIDs involves some complexity. For tests started by resmoke fixtures, we can grab the fixture PIDs directly. For tests that use MongoRunner to start processes, use process.children on the mongo shell process to get the list of child PIDs.
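A minimal sketch of the flow described above, not resmoke's actual implementation. The names `collect_pids`, `make_handler`, `install`, `get_running_tests`, `list_children`, `run_hang_analyzer`, and `enabled` are all hypothetical, as is the dict shape used for a running test. `list_children` stands in for the `process.children` lookup; assuming that refers to psutil, it could be `lambda pid: [c.pid for c in psutil.Process(pid).children(recursive=True)]`. Placing the shell PIDs last is an assumption taken from the title of SERVER-37154.

```python
import signal


def collect_pids(fixture_pids, shell_pids, list_children):
    """Return de-duplicated PIDs: fixtures, then shell children, then shells."""
    ordered = []
    seen = set()

    def add(pid):
        if pid not in seen:
            seen.add(pid)
            ordered.append(pid)

    # Processes started by resmoke fixtures are known directly.
    for pid in fixture_pids:
        add(pid)
    # Processes started from the mongo shell are found via its children.
    for shell_pid in shell_pids:
        for child_pid in list_children(shell_pid):
            add(child_pid)
    # Assumption: analyze the shells last, after the server processes.
    for shell_pid in shell_pids:
        add(shell_pid)
    return ordered


def make_handler(get_running_tests, list_children, run_hang_analyzer, enabled):
    """Build a SIGUSR1 handler; `enabled` models the flag from the NOTE."""

    def handler(signum, frame):
        if not enabled:
            return
        # One running test on a test timeout, possibly several on a task timeout.
        for test in get_running_tests():
            pids = collect_pids(test["fixture_pids"], test["shell_pids"], list_children)
            run_hang_analyzer(pids)

    return handler


def install(handler):
    """Register the handler for SIGUSR1 (Unix only)."""
    signal.signal(signal.SIGUSR1, handler)
```

Passing the child lookup and the analyzer in as callables keeps the handler testable without spawning real processes.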
- depends on
-
SERVER-48705 resmoke.py sending SIGABRT to take core dumps on fixture teardown may overwrite core files from hang analyzer
- Closed
- duplicates
-
SERVER-37154 hang_analyzer should not run against mongo shell until after mongod and mongos processes
- Closed
- is depended on by
-
SERVER-48895 Complete TODO listed in SERVER-46691
- Closed
-
SERVER-46691 Rework the timeout task in evergreen.yml and ensure analysis & archival works
- Closed
- is duplicated by
-
SERVER-46820 Kill hung processes as the last step in resmoke's signal handler
- Closed
-
SERVER-48728 Complete TODO listed in SERVER-46691
- Closed
-
SERVER-46691 Rework the timeout task in evergreen.yml and ensure analysis & archival works
- Closed