[SERVER-28861] Coverity analysis defect 100779: Thread deadlock Created: 19/Apr/17  Updated: 06/Dec/22  Resolved: 02/Jan/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Coverity Collector User
Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Won't Fix
Votes: 0
Labels: coverity, neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Sharding
Operating System: ALL
Sprint: Sharding 2017-10-02
Participants:

 Description   

Threads may try to acquire two locks in different orders, potentially causing deadlock

Defect 100779 (STATIC_C)
Checker ORDER_REVERSAL (subcategory none)
File: /src/mongo/db/logical_clock.cpp
Function mongo::LogicalClock::reserveTicks(unsigned long)



 Comments   
Comment by Sheeri Cabral (Inactive) [ 02/Jan/20 ]

This is a valid bug, but the mock is used in single-threaded unit tests, and the deadlock has not been seen in the wild or in testing.

Comment by Misha Tyulenev [ 21/Nov/17 ]

Moving to Backlog, as this issue cannot occur in production code: BackgroundThreadClockSource::now() does not use a mutex; it reads an atomic.

The suggested fix is to change ClockSourceMock::_now to an atomic so that it no longer needs a mutex.
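
A minimal sketch of what an atomic-backed mock clock could look like, to illustrate the suggestion above. The class name AtomicMockClock and its methods are hypothetical and are not the actual ClockSourceMock interface; the point is only that now() becomes a lock-free read, so a caller that already holds another mutex never acquires a second lock here.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for a mock clock; NOT the real ClockSourceMock.
class AtomicMockClock {
public:
    using Millis = std::chrono::milliseconds;

    // Lock-free read of the current mock time.
    Millis now() const {
        return Millis(_nowMillis.load(std::memory_order_acquire));
    }

    // Move the mock time forward by `delta`.
    void advance(Millis delta) {
        _nowMillis.fetch_add(delta.count(), std::memory_order_acq_rel);
    }

    // Set the mock time to an absolute value.
    void reset(Millis t) {
        _nowMillis.store(t.count(), std::memory_order_release);
    }

private:
    // Milliseconds since an arbitrary epoch, stored as a plain integer
    // so that it can live in a std::atomic.
    std::atomic<std::int64_t> _nowMillis{0};
};
```

With this shape, a test would call advance() or reset() to drive time, and any code path that reads now() while holding its own mutex adds no lock-ordering edge.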

Comment by Spencer Brody (Inactive) [ 12/Jul/17 ]

Assigning to the sharding team since this involves the LogicalClock.

Comment by William Schultz (Inactive) [ 06/Jul/17 ]

The issue Coverity picked up on has to do with inconsistent lock acquisition order involving LogicalClock::_mutex and NetworkInterfaceMock::_mutex, which can lead to a deadlock. Consider the following two functions and the order in which they acquire locks:

LogicalClock::reserveTicks
  1. Acquire LogicalClock::_mutex
  2. Acquire NetworkInterfaceMock::_mutex
NetworkInterfaceMock::_scheduleResponse
  1. Acquire NetworkInterfaceMock::_mutex
  2. Acquire LogicalClock::_mutex (via readReplyFromMetadata)

Since these two functions acquire the same two locks in reverse order, a deadlock is possible if they run on separate threads and interleave so that each thread holds one lock while waiting for the other. I presume that the NetworkInterfaceMock is only used for testing, so this is not a serious issue.
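
To illustrate the pattern, here is a small self-contained sketch of two code paths that need the same pair of mutexes but name them in opposite orders. The struct TwoLocks and the function runBothPaths are hypothetical and only stand in for the real classes. Instead of locking one mutex and then the other, each path takes both through std::lock, which applies a deadlock-avoidance algorithm, so the reversed order is harmless. This is an assumption about one possible remedy, not what the server code does; in the real code the second lock is taken inside a nested call, where the more practical options are a consistent global order or releasing the first lock before calling out.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the two mutexes named in the analysis.
struct TwoLocks {
    std::mutex clockMutex;    // plays the role of LogicalClock::_mutex
    std::mutex networkMutex;  // plays the role of NetworkInterfaceMock::_mutex
    long counter = 0;         // guarded by holding both mutexes

    // Names the locks clock-first, like reserveTicks.
    void clockThenNetwork() {
        std::lock(clockMutex, networkMutex);  // acquires both without deadlock
        std::lock_guard<std::mutex> a(clockMutex, std::adopt_lock);
        std::lock_guard<std::mutex> b(networkMutex, std::adopt_lock);
        ++counter;
    }

    // Names the locks network-first, like _scheduleResponse.
    void networkThenClock() {
        std::lock(networkMutex, clockMutex);  // reversed order is still safe
        std::lock_guard<std::mutex> a(networkMutex, std::adopt_lock);
        std::lock_guard<std::mutex> b(clockMutex, std::adopt_lock);
        ++counter;
    }
};

// Runs both paths concurrently and returns the final counter value.
long runBothPaths(int iterations) {
    TwoLocks locks;
    std::thread t1([&] {
        for (int i = 0; i < iterations; ++i) locks.clockThenNetwork();
    });
    std::thread t2([&] {
        for (int i = 0; i < iterations; ++i) locks.networkThenClock();
    });
    t1.join();
    t2.join();
    return locks.counter;
}
```

If each function instead locked its first mutex and then its second with two separate lock_guard objects, the same two threads could each end up holding one mutex and waiting forever for the other, which is the interleaving the ORDER_REVERSAL checker warns about.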
