[SERVER-80559] Verify that the hang analyzer is functional again. Created: 30/Aug/23 Updated: 13/Sep/23 Resolved: 01/Sep/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Trevor Guidry | Assignee: | Trevor Guidry |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Participants: | |||||
| Description |
|
The hang analyzer broke. Investigation showed that the cause was an Evergreen change that killed the resmoke Python process before the timeout section started (see the attached Slack thread or the Evergreen ticket).
A fix is being made in Evergreen (https://jira.mongodb.org/browse/EVG-20773) and is scheduled to be deployed tomorrow (Aug 31st). We need to verify that the hang analyzer works correctly after this fix. |
| Comments |
| Comment by Trevor Guidry [ 01/Sep/23 ] |
|
On Aug 31st the Evergreen patch was deployed and then reverted due to an unrelated issue. Today it was redeployed and has not been reverted as of this message. We are now seeing core dumps again.
While the patch was active on Aug 31st, we saw an unusual hang analyzer error, which can be seen here: https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_nightly_windows_noPassthrough_4_windows_49a4b450450ee6e0dfc16e56b530d259fbce3c5b_23_08_31_02_48_58/0?type=T#L1570
The full error message is not shown in the log, and the cause is unknown. I did not see this in my testing, but we should watch for it in the future. |