[SERVER-74682] Prevent FIPS tests from running on TSAN builders Created: 07/Mar/23  Updated: 29/Oct/23  Resolved: 29/Jun/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.1.0-rc0

Type: Task Priority: Major - P3
Reporter: Varun Ravichandran Assignee: Gabriel Marks
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Assigned Teams:
Server Security
Backwards Compatibility: Fully Compatible
Sprint: Security 2023-05-01, Security 2023-05-15, Security 2023-05-29, Security 2023-06-12
Participants:
Linked BF Score: 8

 Description   

RHEL 8's FIPS-compliant libcrypto.so has a potential deadlock that TSAN is reporting. The potential deadlock appears to manifest during SSL context initialization and is only triggered when FIPS mode is enabled. Since the calling code occurs within a MONGO_INITIALIZER, which executes in a single-threaded context, and the bug itself is in third-party library code, we can avoid these BFs by simply preventing FIPS tests from running on TSAN builders.



 Comments   
Comment by Githook User [ 03/Jul/23 ]

Author:

{'name': 'Gabriel Marks', 'email': 'gabriel.marks@mongodb.com', 'username': 'marksg07'}

Message: SERVER-74682 Add tsan_incompatible tag to block FIPS tests
Branch: EVG-17874-taskgen-test
https://github.com/mongodb/mongo/commit/7c10f0fc6dea8e9ec9149cc9f7c9e9bfda15e124

Comment by Githook User [ 29/Jun/23 ]

Author:

{'name': 'Gabriel Marks', 'email': 'gabriel.marks@mongodb.com', 'username': 'marksg07'}

Message: SERVER-74682 Add tsan_incompatible tag to block FIPS tests
Branch: master
https://github.com/mongodb/mongo/commit/7c10f0fc6dea8e9ec9149cc9f7c9e9bfda15e124

Comment by Gabriel Marks [ 27/Jun/23 ]

alex.neben@mongodb.com, the two stack traces that TSAN flagged as having deadlock-prone lock ordering come from two separate initializers. Because initializers are guaranteed to run single-threaded, this cannot cause a real deadlock. I'm not sure exactly what TSAN is confused about here (perhaps it doesn't understand how our initializers work), but TSAN is not entirely free of false positives, so I'm going to record this as a rare TSAN false positive, especially since we have not seen this in the wild and have no evidence of hitting a deadlock here on non-TSAN builders.
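
One possible illustration of how such a report can arise (an assumption, not something confirmed in this ticket): TSAN's deadlock detector reasons about the order in which mutexes are acquired, so two code paths that take the same pair of locks in opposite orders can be reported as a potential deadlock even if they only ever run one after the other on a single thread. The sketch below uses hypothetical names (`lockA`, `lockB`, `initializerOne`, `initializerTwo`) and does not reproduce the actual libcrypto.so code paths.

```cpp
#include <mutex>

// Hypothetical stand-ins for two locks inside a third-party library.
std::mutex lockA;
std::mutex lockB;

// Hypothetical first initializer: acquires lockA, then lockB.
int initializerOne() {
    std::lock_guard<std::mutex> a(lockA);
    std::lock_guard<std::mutex> b(lockB);
    return 1;
}

// Hypothetical second initializer: acquires the same locks in the opposite order.
int initializerTwo() {
    std::lock_guard<std::mutex> b(lockB);
    std::lock_guard<std::mutex> a(lockA);
    return 2;
}

// Runs both initializers back to back on the calling thread.
// Each one releases its locks before the next starts, so this cannot
// deadlock, yet the acquisition order is inconsistent across the two.
int runInitializersSequentially() {
    return initializerOne() + initializerTwo();
}
```

Calling `runInitializersSequentially()` always completes, which matches the argument that single-threaded initializers cannot actually deadlock; the inconsistent ordering would only become a real hazard if the two paths could run concurrently.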

Comment by Alex Neben [ 20/Apr/23 ]

Your team should feel empowered to fix these tests however you want. However, I am skeptical of the logic here: "Since the calling code occurs within a MONGO_INITIALIZER, which executes in a single-threaded context, and the bug itself is in third-party library code". The way TSAN works is that it basically marks memory as "race-able" and unmarks it when a mutex is acquired, a memory fence happens, etc. My guess is that while this memory is written in a single-threaded context, something else that runs after the initializers 99.99% of the time is not actually guaranteed to run after them. My suspicion is that this should instead be fixed by guarding the OpenSSL calls with a mutex or an atomic.

 

cc varun.ravichandran@mongodb.com 
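
For illustration only, the mutex-guard suggestion in the comment above could look roughly like the following sketch. All names here (`cryptoLibraryMutex`, `withCryptoLibraryLock`, `fakeContextInit`) are hypothetical, the code does not call OpenSSL, and whether such a guard would silence the TSAN report is not established in this ticket.

```cpp
#include <mutex>
#include <utility>

// Hypothetical process-wide mutex that serializes calls into the crypto library.
inline std::mutex& cryptoLibraryMutex() {
    static std::mutex m;
    return m;
}

// Invokes fn while holding the mutex and returns its result.
// The lock is released when the guard goes out of scope, including on exceptions.
template <typename Fn>
auto withCryptoLibraryLock(Fn&& fn) -> decltype(std::forward<Fn>(fn)()) {
    std::lock_guard<std::mutex> lk(cryptoLibraryMutex());
    return std::forward<Fn>(fn)();
}

// Hypothetical stand-in for a library call such as SSL context setup.
int fakeContextInit() {
    return 0;
}
```

A caller would wrap each library call, for example `withCryptoLibraryLock(fakeContextInit)`. Note that `std::mutex` is not recursive, so a wrapped call must not itself call `withCryptoLibraryLock`.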
