[SERVER-77887] Investigate timeout in running fcv upgrade/downgrade test for TSAN variant Created: 07/Jun/23  Updated: 12/Dec/23

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Adi Zaimi Assignee: Backlog - Cluster Scalability
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Cluster Scalability
Sprint: Sharding NYC 2023-06-12, Sharding NYC 2023-06-26, Sharding NYC 2023-07-10, Sharding NYC 2023-07-24, Sharding NYC 2023-08-07, Sharding NYC 2023-08-21, Sharding NYC 2023-09-04, Sharding NYC 2023-09-18, Sharding NYC 2023-10-02, Sharding NYC 2023-10-16
Participants:

 Description   

The '* TSAN Enterprise RHEL 8.0 DEBUG' variant fails with a timeout. After an initial look at this (with Max), we suspect that there is either a crash or a deadlock, that TSAN then tries to generate a core, and that this fails because TSAN cores are very large (14GB).

Example evergreen run can be found here: https://spruce.mongodb.com/task/mongodb_mongo_master_enterprise_rhel80_debug_tsan_display_fcv_upgrade_downgrade_replica_sets_jscore_passthrough_patch_dcd8c050fb807cd6a30f1c3f833f4be23c22fdcf_64798230c9ec4406ad3ccb8a_23_06_02_05_46_27/execution-tasks?execution=0&sorts=STATUS%3AASC

At this time the TSAN variant will not run this yml. Once we find out why this issue occurs and have a fix or workaround, we should add 'fcv_upgrade_downgrade_replica_sets_jscore_passthrough_gen' back to the variant "* TSAN Enterprise RHEL 8.0 DEBUG" in etc/evergreen.yml.



 Comments   
Comment by Adi Zaimi [ 07/Jun/23 ]

PN: Would need to add:
 - name: fcv_upgrade_downgrade_replica_sets_jscore_passthrough_gen
 - name: fcv_upgrade_downgrade_sharding_jscore_passthrough_gen
 - name: fcv_upgrade_downgrade_sharded_collections_jscore_passthrough_gen

to the 'TSAN' variant section once the issue here has been found.
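
For reference, a minimal sketch of how these entries might sit in etc/evergreen.yml, assuming the usual Evergreen `buildvariants` layout. Only the display name and the three task names come from this ticket; the variant `name` is an assumption inferred from the task URL, and the surrounding keys are illustrative.

```yaml
# Sketch only: the `name` value and the surrounding keys are assumptions,
# not copied from etc/evergreen.yml.
buildvariants:
- name: enterprise-rhel80-debug-tsan   # assumed variant id
  display_name: "* TSAN Enterprise RHEL 8.0 DEBUG"
  tasks:
  # ... existing tasks stay as they are ...
  - name: fcv_upgrade_downgrade_replica_sets_jscore_passthrough_gen
  - name: fcv_upgrade_downgrade_sharding_jscore_passthrough_gen
  - name: fcv_upgrade_downgrade_sharded_collections_jscore_passthrough_gen
```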

Comment by Adi Zaimi [ 07/Jun/23 ]

Here is another run for the same issue: https://spruce.mongodb.com/task/mongodb_mongo_master_enterprise_rhel80_debug_tsan_display_fcv_upgrade_downgrade_replica_sets_jscore_passthrough_patch_dcd8c050fb807cd6a30f1c3f833f4be23c22fdcf_64781a8c2a60ed8cd014ec59_23_06_01_04_11_59/execution-tasks?execution=0&sorts=STATUS%3AASC

Generated at Thu Feb 08 06:36:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.