[SERVER-37387] mongo_lock.py graph displays lock request type for LockManager locks Created: 28/Sep/18  Updated: 29/Oct/23  Resolved: 10/Oct/18

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 4.1.4

Type: Bug Priority: Major - P3
Reporter: Jonathan Abrahams Assignee: Max Hirschhorn
Resolution: Fixed Votes: 0
Labels: tig-hanganalyzer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-34738 mongo_lock.py graph should display lo... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: STM 2018-10-22
Participants:
Story Points: 1

 Description   

Changes for SERVER-34738 now display the lock requester's mode. We need to display the lock holder's mode as well.

We should change the following

            print("MongoDB Lock at {} ({}) held by {} waited on by {}".format(
                lock_head, lock_request["mode"], lock_holder, lock_waiter))

to

           # Code to set the lock_mode needs to be added
            print("MongoDB Lock at {} ({}) held by {} waited on by {} ({})".format(
                lock_head, lock_mode, lock_holder, lock_waiter, lock_request["mode"]))

Other references to lock_request["mode"] should be modified as well, such that the mode is properly associated to holder.



 Comments   
Comment by Githook User [ 10/Oct/18 ]

Author:

{'name': 'Max Hirschhorn', 'email': 'max.hirschhorn@mongodb.com', 'username': 'visemet'}

Message: SERVER-37387 Clarify how lock mode displayed refers to mode held.
Branch: master
https://github.com/mongodb/mongo/commit/e01cdb673978f6d3a9d53bb3a6ba35b34c281b0e

Comment by Esha Maharishi (Inactive) [ 10/Oct/18 ]

The mode argument to the LockerImpl::lockComplete() function that the waiter is blocked in would be the mode for the pending request. It tends to be optimized out in the LockerImpl::lockComplete() frame that we'd be unlikely to include it in the output anyway.

As I mentioned in my comment above, information about the mode for the pending request would be challenging to try and get to appear in the waits-for graph because it tends to be optimized out of the frame we're examining. It is however usually recorded in an earlier stack frame or can be deduced from the C++ code in the stacktrace.

I see. My confusion came from BF-10750, where there is no mention of an X lock request in the waits-for graph, though the deadlock was a result of an enqueued (though not held) X lock. You are right though that in this case, the thread that has enqueued the X lock request will always appear in the stack traces.

Maybe the "legend" at the top of the .gv file could be clarified?

It currently says:

# Legend:
#    Thread 1 -> Lock 1 indicates Thread 1 is waiting on Lock 1
#    Lock 2 -> Thread 2 indicates Lock 2 is held by Thread 2

Here's an example of the two types of lines:

  ""conn419" (Thread 0x7f2685aa6700 (LWP 13733))" -> "Lock 0x7f26c0334780 (mongo::MODE_IS)";
  "Lock 0x7f26c0334780 (mongo::MODE_IS)" -> ""conn288" (Thread 0x7f268c2e4700 (LWP 31650))";

The first line is unclear; does it mean:

"Thread 1 is waiting to acquire Lock 1 in MODE_IS, but Lock 1 is held in a conflicting mode"

or

"Thread 1 is waiting to acquire Lock 1 in some mode which conflicts with the mode Lock 1 is currently held in, which is MODE_IS"

 

I do assume the second line means "Lock 2 is held by Thread 2 in MODE_IS."

Comment by Max Hirschhorn [ 10/Oct/18 ]

esha.maharishi, since I believe this ticket was file as a result of some confusion around seeing threads waiting on a lock held in MODE_IX, is there something I could do to help clarify the language being written to the task logs and used for the labels in the waits-for graph? The behavior you observed is expected when a MODE_X lock request is pending as how the lock manager ensures fairness / avoids starvation. As I mentioned in my comment above, information about the mode for the pending request would be challenging to try and get to appear in the waits-for graph because it tends to be optimized out of the frame we're examining. It is however usually recorded in an earlier stack frame or can be deduced from the C++ code in the stacktrace. The fully-detailed table of granted lock requests and pending lock requests would also be logged by the mongod process and would be visible in the test logs.

Comment by Max Hirschhorn [ 10/Oct/18 ]

Changes for SERVER-34738 now display the lock requester's mode. We need to display the lock holder's mode as well.

Other references to lock_request["mode"] should be modified as well, such that the mode is properly associated to holder.

jonathan.abrahams, the find_lock_manager_holders() function in mongo_lock.py iterates through the LockHead::grantedList of any LockerImpl where there's a pending lock request. It determines that there's a pending lock request by the thread being blocked in a function on LockerImpl. lock_request["mode"] therefore already represents the mode in which the lock is held. The mode argument to the LockerImpl::lockComplete() function that the waiter is blocked in would be the mode for the pending request. It tends to be optimized out in the LockerImpl::lockComplete() frame that we'd be unlikely to include it in the output anyway.

[2018/10/05 00:55:18.538] #7  0x00007f8c73000334 in mongo::LockerImpl::lockComplete (this=0x7f8c79d8e680, opCtx=0x0, resId=..., mode=<optimized out>, deadline=..., checkDeadlock=false) at src/mongo/db/concurrency/lock_state.cpp:853
[2018/10/05 00:55:18.538] #8  0x00007f8c72ff1061 in mongo::Lock::ResourceLock::lock (this=0x7f8c47c95d30, mode=<optimized out>) at src/mongo/db/concurrency/d_concurrency.cpp:323
[2018/10/05 00:55:18.538] #9  0x00007f8c72ff1131 in mongo::Lock::GlobalLock::_enqueue (this=0x7f8c47c95d20, lockMode=mongo::MODE_IS, deadline=...) at src/mongo/db/concurrency/d_concurrency.cpp:173
[2018/10/05 00:55:18.539] #10 0x00007f8c72ff11f2 in mongo::Lock::GlobalLock::GlobalLock (this=0x7f8c47c95d20, opCtx=<optimized out>, lockMode=mongo::MODE_IS, deadline=..., behavior=<optimized out>, enqueueOnly=...) at src/mongo/db/concurrency/d_concurrency.cpp:157

My understanding is that James Wahlin wanted to know the mode the lock is currently held in and that has been achieved in SERVER-34738 already. Not including the mode for the pending request means that call ('mongo::(anonymous namespace)::globalLockManager').dump() continues to yield more information than the find_lock_manager_holders() output, but I think I'd rather leave addressing that gap to SERVER-28437.

Generated at Thu Feb 08 04:45:50 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.