[SERVER-74682] Prevent FIPS tests from running on TSAN builders Created: 07/Mar/23 Updated: 29/Oct/23 Resolved: 29/Jun/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.1.0-rc0 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Varun Ravichandran | Assignee: | Gabriel Marks |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: |
Server Security
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Sprint: | Security 2023-05-01, Security 2023-05-15, Security 2023-05-29, Security 2023-06-12 | ||||
| Participants: | |||||
| Linked BF Score: | 8 | ||||
| Description |
|
RHEL 8's FIPS-compliant libcrypto.so library contains a potential deadlock that TSAN is flagging. The potential deadlock appears during SSL context initialization and is only triggered when FIPS mode is enabled. Because the calling code runs inside a MONGO_INITIALIZER, which executes in a single-threaded context, and the bug itself is in third-party library code, we can avoid these BFs by simply preventing FIPS tests from running on TSAN builders. |
| Comments |
| Comment by Githook User [ 03/Jul/23 ] |
|
Author: {'name': 'Gabriel Marks', 'email': 'gabriel.marks@mongodb.com', 'username': 'marksg07'} Message: |
| Comment by Githook User [ 29/Jun/23 ] |
|
Author: {'name': 'Gabriel Marks', 'email': 'gabriel.marks@mongodb.com', 'username': 'marksg07'} Message: |
| Comment by Gabriel Marks [ 27/Jun/23 ] |
|
alex.neben@mongodb.com, the two stack traces that TSAN flags in the FIPS tests as having deadlock-prone lock ordering come from two separate initializers. Because initializers are guaranteed to run single-threaded, this cannot be a real deadlock. I'm not sure exactly what TSAN is confused about here (perhaps it doesn't understand how our initializers work), but TSAN is not free of false positives, so I'm going to mark this down as a rare TSAN false positive, especially since we have not seen this in the wild and have no evidence of hitting a deadlock here on non-TSAN builders. |
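For readers following the false-positive argument: as far as I understand it, TSAN's deadlock detector reasons about the order in which locks are acquired, not about whether two threads ever actually contend, so it can report a lock-order inversion even when every acquisition happens on one thread. The sketch below illustrates that pattern only. The mutexes and the two "initializer" functions are hypothetical stand-ins, not the real libcrypto or MONGO_INITIALIZER code.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for two locks taken inside third-party library code.
std::mutex lockA;
std::mutex lockB;
int initCount = 0;

// Hypothetical first initializer: acquires A, then B.
void firstInitializer() {
    std::lock_guard<std::mutex> a(lockA);
    std::lock_guard<std::mutex> b(lockB);
    ++initCount;
}

// Hypothetical second initializer: acquires B, then A (the reverse order).
void secondInitializer() {
    std::lock_guard<std::mutex> b(lockB);
    std::lock_guard<std::mutex> a(lockA);
    ++initCount;
}

// Runs both initializers back to back on the calling thread. Each function
// releases its locks before the next one starts, so this cannot deadlock,
// but the two acquisition orders still form a cycle in a lock-order graph.
int runInitializersSequentially() {
    initCount = 0;
    firstInitializer();
    secondInitializer();
    return initCount;
}
```

Calling `runInitializersSequentially()` from a single thread always completes. Built with `-fsanitize=thread`, a program like this may still be reported as a "lock-order-inversion (potential deadlock)", which is consistent with the reading that the report describes a possible ordering rather than an observed hang.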
| Comment by Alex Neben [ 20/Apr/23 ] |
|
Your team should feel empowered to fix these tests however you want. However, I am skeptical of the logic here: "Since the calling code occurs within a MONGO_INITIALIZER, which executes in a single-threaded context, and the bug itself is in third-party library code". The way TSAN works is that it basically marks memory as "race-able" and unmarks it when a mutex is acquired, a memory fence happens, and so on. My guess is that while this memory is written in a single-threaded context, there is something that will run after the initializers 99.99% of the time but is not guaranteed to do so. My suspicion is that this should instead be fixed by using a mutex or atomic to guard the OpenSSL calls.
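A minimal sketch of the mutex-guard idea suggested above, under stated assumptions: `cryptoInitMutex`, `withCryptoLock`, and `fakeCreateSSLContext` are hypothetical names invented for illustration, and the placeholder function stands in for whatever OpenSSL call would actually be guarded. It is not the change that was committed for this ticket.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical process-wide lock; not an existing MongoDB or OpenSSL symbol.
inline std::mutex& cryptoInitMutex() {
    static std::mutex m;
    return m;
}

// Runs the callable while holding the lock, so guarded calls never overlap.
template <typename F>
decltype(auto) withCryptoLock(F&& f) {
    std::lock_guard<std::mutex> lk(cryptoInitMutex());
    return std::forward<F>(f)();
}

// Placeholder for an SSL context initialization call (assumption: the real
// call would touch shared library state, modeled here as a plain counter).
int fakeContextsCreated = 0;
int fakeCreateSSLContext() {
    return ++fakeContextsCreated;
}

// Creates one fake context per thread, with every call serialized by the lock.
int createContextsFromThreads(int nThreads) {
    fakeContextsCreated = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < nThreads; ++i) {
        threads.emplace_back([] { withCryptoLock(fakeCreateSSLContext); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return fakeContextsCreated;
}
```

A guard like this gives TSAN a synchronization point for the shared state and serializes the guarded calls. It would not by itself change the order in which the library takes its own internal locks, so whether it would silence the report discussed in this ticket is an open question.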
|