[SERVER-80559] Verify that the hang analyzer is functional again. Created: 30/Aug/23  Updated: 13/Sep/23  Resolved: 01/Sep/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Trevor Guidry Assignee: Trevor Guidry
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Participants:

 Description   

The hang analyzer broke. Investigation showed that the cause was an Evergreen change that killed the resmoke Python process before the timeout section started (see the attached Slack thread or the Evergreen ticket).

 

A fix is being made in Evergreen (https://jira.mongodb.org/browse/EVG-20773) and is scheduled to be deployed tomorrow (Aug 31st). We need to verify that the hang analyzer works correctly once the fix is deployed.



 Comments   
Comment by Trevor Guidry [ 01/Sep/23 ]

On Aug 31st the Evergreen patch was deployed and then reverted due to an unrelated issue. Today it was redeployed and, as of this message, it has not been reverted. We are now seeing core dumps again.

 

During the time it was active on Aug 31st we saw an unusual hang analyzer error, which can be seen here: https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_nightly_windows_noPassthrough_4_windows_49a4b450450ee6e0dfc16e56b530d259fbce3c5b_23_08_31_02_48_58/0?type=T#L1570

 

The full error message is not shown and the cause is unknown. I did not see this in my testing, but we should look out for it in the future.

Generated at Thu Feb 08 06:43:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.