[SERVER-60820] Hang analyzer fails to run in Evergreen
Created: 19/Oct/21  Updated: 29/Oct/23  Resolved: 20/Oct/21

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 5.1.2, 5.2.0-rc0

Type: Bug
Priority: Critical - P2
Reporter: Max Hirschhorn
Assignee: Richard Samuels (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-59670 Hang analyzer not using Evergreen cre... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.1
Sprint: STM 2021-11-01
Participants:
Story Points: 0

 Description   

The hang analyzer failed while trying to get the Evergreen credentials. The exception handling has a bug of its own, so the traceback below hides the real reason the credentials could not be obtained.

[2021/10/19 14:35:13.961] Traceback (most recent call last):
[2021/10/19 14:35:13.961]   File "/data/mci/0e0485c0847ee669d94ad88fcf5941cf/src/buildscripts/resmokelib/run/__init__.py", line 306, in _execute_suite
[2021/10/19 14:35:13.961]     executor.run()
[2021/10/19 14:35:13.961]   File "/data/mci/0e0485c0847ee669d94ad88fcf5941cf/src/buildscripts/resmokelib/testing/executor.py", line 120, in run
[2021/10/19 14:35:13.961]     (report, interrupted) = self._run_tests(test_queue, setup_flag, teardown_flag)
[2021/10/19 14:35:13.961]   File "/data/mci/0e0485c0847ee669d94ad88fcf5941cf/src/buildscripts/resmokelib/testing/executor.py", line 209, in _run_tests
[2021/10/19 14:35:13.961]     thr.join()
[2021/10/19 14:35:13.961]   File "/opt/mongodbtoolchain/revisions/4ac427ffa2fb12ffce7028023dae1775a06e9bf5/stow/python3-v3.8gt/lib/python3.9/threading.py", line 1033, in join
[2021/10/19 14:35:13.961]     self._wait_for_tstate_lock()
[2021/10/19 14:35:13.961]   File "/opt/mongodbtoolchain/revisions/4ac427ffa2fb12ffce7028023dae1775a06e9bf5/stow/python3-v3.8gt/lib/python3.9/threading.py", line 1049, in _wait_for_tstate_lock
[2021/10/19 14:35:13.961]     elif lock.acquire(block, timeout):
[2021/10/19 14:35:13.961]   File "/data/mci/0e0485c0847ee669d94ad88fcf5941cf/src/buildscripts/resmokelib/sighandler.py", line 37, in _handle_sigusr1
[2021/10/19 14:35:13.961]     _dump_and_log(header_msg)
[2021/10/19 14:35:13.961]   File "/data/mci/0e0485c0847ee669d94ad88fcf5941cf/src/buildscripts/resmokelib/sighandler.py", line 74, in _dump_and_log
[2021/10/19 14:35:13.961]     _analyze_pids(logger, pids_to_analyze)
[2021/10/19 14:35:13.961]   File "/data/mci/0e0485c0847ee669d94ad88fcf5941cf/src/buildscripts/resmokelib/sighandler.py", line 162, in _analyze_pids
[2021/10/19 14:35:13.961]     _hang_analyzer.execute()
[2021/10/19 14:35:13.961]   File "/data/mci/0e0485c0847ee669d94ad88fcf5941cf/src/buildscripts/resmokelib/hang_analyzer/hang_analyzer.py", line 102, in execute
[2021/10/19 14:35:13.961]     my_symbolizer = Symbolizer(self.task_id, download_symbols_only=True)
[2021/10/19 14:35:13.961]   File "/data/mci/0e0485c0847ee669d94ad88fcf5941cf/src/buildscripts/resmokelib/symbolizer/__init__.py", line 53, in __init__
[2021/10/19 14:35:13.961]     self.evg_api: evergreen_conn.RetryingEvergreenApi = evergreen_conn.get_evergreen_api()
[2021/10/19 14:35:13.961]   File "/data/mci/0e0485c0847ee669d94ad88fcf5941cf/src/buildscripts/resmokelib/utils/evergreen_conn.py", line 74, in get_evergreen_api
[2021/10/19 14:35:13.961]     raise last_ex
[2021/10/19 14:35:13.961] TypeError: exceptions must derive from BaseException

https://evergreen.mongodb.com/lobster/evergreen/task/mongodb_mongo_master_enterprise_rhel_80_64_bit_dynamic_required_burn_in:concurrency_sharded_with_stepdowns_0_enterprise_rhel_80_64_bit_dynamic_required_patch_40b6c60df9c863a1f473287fcc139f71dcd2954a_616ebeda3066155f964eef82_21_10_19_12_50_17/0/task#bookmarks=0%2C1708&l=1&shareLine=1486



 Comments   
Comment by Richard Samuels (Inactive) [ 20/Oct/21 ]

I suspect that this error only occurred in burn_in_tags because we don't set up an Evergreen credential file for all generated burn-in tasks.

The attached change fixes the exception handling and modifies the hang analyzer to always set up the Evergreen credential file before it runs. Between this change and the previous one, this problem should finally disappear.
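
As a rough illustration of what corrected exception handling could look like, here is a minimal sketch. It is not the code from the commit: `first_usable`, `NoUsableConfigError`, and the `load` callable are hypothetical names chosen for this example. The idea is to raise a descriptive error when there are no candidate files at all, and to chain the last real failure when every candidate was tried and rejected.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


class NoUsableConfigError(RuntimeError):
    """Hypothetical error: no candidate credential file produced a client."""


def first_usable(candidates: Iterable[str], load: Callable[[str], T]) -> T:
    """Return the result of the first candidate that `load` accepts."""
    last_ex: Optional[Exception] = None
    tried: List[str] = []
    for path in candidates:
        tried.append(path)
        try:
            return load(path)
        except Exception as ex:  # broad on purpose: any failure moves on to the next file
            last_ex = ex
    if last_ex is None:
        # The loop never ran, so there is no underlying exception to re-raise.
        raise NoUsableConfigError("no candidate credential files were found")
    # Keep the original failure attached as __cause__ so the real reason is visible.
    raise NoUsableConfigError(f"all candidates failed: {tried}") from last_ex
```

With this shape, the "no files found" case and the "files found but unusable" case both produce an exception that says what actually went wrong.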

Comment by Githook User [ 20/Oct/21 ]

Author:

{'name': 'Richard Samuels', 'email': 'richard.l.samuels@gmail.com', 'username': 'richardsamuels'}

Message: SERVER-60820 Hang analyzer require evergreen credentials file to be written to task directory
Branch: master
https://github.com/mongodb/mongo/commit/fd61e7e863a05f354e0f9ffce14ea0578edbd642

Comment by Richard Samuels (Inactive) [ 19/Oct/21 ]

The hang analyzer now looks at multiple directories, checks if there is a .evergreen.yml file, and if it exists, tries to use it to contact evergreen. 

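If _find_evergreen_yaml_candidates() finds no .evergreen.yml files, no attempt is ever made, so the saved exception is still None when the hang analyzer reaches `raise last_ex`. Raising None is invalid in Python, which is why the traceback ends in "TypeError: exceptions must derive from BaseException" instead of the real failure.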
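
The failure mode can be reproduced in isolation. The snippet below is a hypothetical reduction of the pattern, not the actual evergreen_conn.py source; `get_api_buggy` and its `load` parameter are names made up for this sketch.

```python
from typing import Callable, List


def get_api_buggy(candidates: List[str], load: Callable[[str], object]) -> object:
    """Hypothetical reduction of the failing pattern, not the real resmoke code."""
    last_ex = None
    for path in candidates:
        try:
            return load(path)
        except Exception as ex:
            last_ex = ex
    # With an empty candidate list the loop body never runs,
    # so this statement is effectively `raise None`.
    raise last_ex


if __name__ == "__main__":
    try:
        get_api_buggy([], load=open)
    except TypeError as ex:
        print(f"TypeError: {ex}")
```

Calling it with an empty list raises the same TypeError seen at the bottom of the traceback in the description.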

Further question: why did the hang analyzer not find a single .evergreen.yml file, not even the broken one in the home directory?

Generated at Thu Feb 08 05:50:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.