[SERVER-50924] Suppress TSAN for LockerImpl::getLockerInfo() Created: 14/Sep/20  Updated: 29/Oct/23  Resolved: 18/Sep/20

Status: Closed
Project: Core Server
Component/s: Concurrency
Affects Version/s: None
Fix Version/s: 4.8.0

Type: Bug Priority: Major - P3
Reporter: Gregory Wlodarek Assignee: Gregory Wlodarek
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-51053 Create a new macro for suppressing TS... Backlog
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2020-10-05
Participants:
Linked BF Score: 11

 Description   

The thread sanitizer build variant has caught a couple of data race scenarios that involve the Locker::getLockerInfo() method. So far, these data races involve diagnostic commands such as currentOp or FTDC.

  1. The first data race in LockerImpl::getLockerInfo() involves iterating over the _requests map and reading each lock request's mode. The problem is that we insert the lock request into the map with mode MODE_NONE. From that point on, the lock request is accessible to the map iterator in LockerImpl::getLockerInfo(), but later, while we're still acquiring the lock, we change the lock request's mode without any synchronization.
  2. The second data race in LockerImpl::getLockerInfo() involves copying the locker info stats. While the copy is in progress, a new global lock acquisition can come in and increment the statistics. This is problematic because there is no synchronization between incrementing the counter and copying its value.
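
The two scenarios above can be sketched in a simplified, single-threaded form. This is not the MongoDB implementation: SketchLocker, beginAcquire, finishAcquire, the acquire counter, and the SKETCH_NO_SANITIZE_THREAD macro are hypothetical names chosen for illustration, and only the compiler attributes are real. The sketch shows what a diagnostic reader can observe (a request still in MODE_NONE, a counter copied while it may be changing) and how a function can be excluded from thread sanitizer instrumentation. Note that the attribute only silences the report for that function; it does not add synchronization.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical macro name for this sketch, not the macro used in the codebase.
// Clang and GCC both provide an attribute that disables TSAN instrumentation
// for a single function.
#if defined(__clang__)
#define SKETCH_NO_SANITIZE_THREAD __attribute__((no_sanitize("thread")))
#elif defined(__GNUC__)
#define SKETCH_NO_SANITIZE_THREAD __attribute__((no_sanitize_thread))
#else
#define SKETCH_NO_SANITIZE_THREAD
#endif

enum LockMode { MODE_NONE, MODE_IS, MODE_IX, MODE_S, MODE_X };

struct LockRequest {
    LockMode mode = MODE_NONE;
};

struct OneLockInfo {
    int resourceId;
    LockMode mode;
};

class SketchLocker {
public:
    // Scenario 1, first half: the request becomes visible in the map while
    // its mode is still MODE_NONE. The counter bump stands in for the
    // statistics update of scenario 2 (an assumption of this sketch).
    void beginAcquire(int resourceId) {
        _requests[resourceId] = LockRequest{};
        ++_acquireCount;
    }

    // Scenario 1, second half: the mode is written later, with no
    // synchronization against a concurrent reader.
    void finishAcquire(int resourceId, LockMode mode) {
        _requests[resourceId].mode = mode;
    }

    // Diagnostic read path. In a multi-threaded build these reads would race
    // with the writes above; the attribute only stops TSAN from reporting it.
    SKETCH_NO_SANITIZE_THREAD
    std::vector<OneLockInfo> getLockerInfo(std::uint64_t* acquireCountOut) const {
        std::vector<OneLockInfo> out;
        for (const auto& entry : _requests) {
            out.push_back(OneLockInfo{entry.first, entry.second.mode});
        }
        *acquireCountOut = _acquireCount;  // unsynchronized copy of the stat
        return out;
    }

private:
    std::map<int, LockRequest> _requests;
    std::uint64_t _acquireCount = 0;
};
```

Calling getLockerInfo() between beginAcquire() and finishAcquire() returns the request with MODE_NONE, which is the intermediate state a diagnostic command could report in the real system.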

Based on these findings, it would be wise to audit LockerImpl::getLockerInfo() for any other data races the thread sanitizer may not have discovered yet.



 Comments   
Comment by Githook User [ 18/Sep/20 ]

Author: Gregory Wlodarek <gregory.wlodarek@mongodb.com> (GWlodarek)

Message: SERVER-50924 Suppress TSAN for LockerImpl::getLockerInfo()
Branch: master
https://github.com/mongodb/mongo/commit/52b5d6db7f947e09267efe08346cdd6bb54fe0ff

Comment by Gregory Wlodarek [ 14/Sep/20 ]

That's a fair point, bruce.lucas, we can also use this ticket to determine if it's better to suppress the TSAN warnings for this function.

At worst, the counters could underreport the true value for lock acquisition statistics and currentOp may list lock acquisitions with MODE_NONE while they're still being acquired early on in the lock stage.

Comment by Bruce Lucas (Inactive) [ 14/Sep/20 ]

For diagnostic purposes, depending on the nature and consequence of the data races, I wonder if it may not be worth the cost to fix it.
