  1. Core Server
  2. SERVER-77887

Investigate timeout when running the FCV upgrade/downgrade test on the TSAN variant

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • Cluster Scalability
    • Sharding NYC 2023-06-12, Sharding NYC 2023-06-26, Sharding NYC 2023-07-10, Sharding NYC 2023-07-24, Sharding NYC 2023-08-07, Sharding NYC 2023-08-21, Sharding NYC 2023-09-04, Sharding NYC 2023-09-18, Sharding NYC 2023-10-02, Sharding NYC 2023-10-16

      The '* TSAN Enterprise RHEL 8.0 DEBUG' variant fails with a timeout. After an initial look at this (with Max), we suspect that there is either a crash or a deadlock, and that the attempt to generate a core dump then fails because TSAN cores are very large (14GB).

      Example evergreen run can be found here: https://spruce.mongodb.com/task/mongodb_mongo_master_enterprise_rhel80_debug_tsan_display_fcv_upgrade_downgrade_replica_sets_jscore_passthrough_patch_dcd8c050fb807cd6a30f1c3f833f4be23c22fdcf_64798230c9ec4406ad3ccb8a_23_06_02_05_46_27/execution-tasks?execution=0&sorts=STATUS%3AASC

      At this time the TSAN variant does not run this yml. Once we have found out why the issue occurs and have a fix or workaround, we should add 'fcv_upgrade_downgrade_replica_sets_jscore_passthrough_gen' back to the variant "* TSAN Enterprise RHEL 8.0 DEBUG" in etc/evergreen.yml.
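
      For reference, re-adding the task might look roughly like the fragment below. This is only a sketch of the general Evergreen buildvariant layout. The variant `name` is an assumption inferred from the task URL above, and the surrounding entries in the real file are elided, so check both against etc/evergreen.yml before using it.

      ```yaml
      buildvariants:
        # Assumed variant id (inferred from the Evergreen task URL, not confirmed).
        - name: enterprise-rhel80-debug-tsan
          display_name: "* TSAN Enterprise RHEL 8.0 DEBUG"
          tasks:
            # ... existing tasks for this variant stay unchanged ...
            # Task to restore once the timeout is fixed or worked around.
            - name: fcv_upgrade_downgrade_replica_sets_jscore_passthrough_gen
      ```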

            backlog-server-cluster-scalability [DO NOT USE] Backlog - Cluster Scalability
            adi.zaimi@mongodb.com Adi Zaimi