[SERVER-36230] Waits-for graph no longer being generated by hang_analyzer.py script Created: 20/Jul/18  Updated: 29/Oct/23  Resolved: 02/Aug/18

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: 4.1.1
Fix Version/s: 4.1.2

Type: Bug Priority: Critical - P2
Reporter: Siyuan Zhou Assignee: Max Hirschhorn
Resolution: Fixed Votes: 0
Labels: tig-hanganalyzer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
is caused by SERVER-36011 remove MMAPv1 support from the lock m... Closed
Related
related to SERVER-36356 Improve test coverage for hang analyz... Closed
Backwards Compatibility: Fully Compatible
Sprint: TIG 2018-08-13
Participants:
Linked BF Score: 0
Story Points: 2

 Description   

In BF-9986, we saw many threads waiting on DB locks, but the hang analyzer did not generate the waits-for graph for us. The lock manager still dumps the lock information, though.

 [2018/07/18 12:40:44.821] warning: target file /proc/17238/cmdline contained unexpected null characters
 [2018/07/18 12:40:44.821] Saved corefile dump_mongod.17238.core
 [2018/07/18 12:41:07.364] Running Hang Analyzer Supplement - MongoDBDumpLocks
 [2018/07/18 12:41:07.364] Not generating the digraph, since the lock graph is empty
 [2018/07/18 12:41:07.364] Running Print JavaScript Stack Supplement
 [2018/07/18 12:41:07.364] Detaching from program: /data/mci/d8572849d2ad8ce1953117503927f065/src/mongod, process 17238
 [2018/07/18 12:41:07.491] Done analyzing mongod process with PID 17238
 [2018/07/18 12:41:07.491] Debugger /opt/mongodbtoolchain/gdb/bin/gdb, analyzing mongod process with PID 17241



 Comments   
Comment by Githook User [ 02/Aug/18 ]

Author:

{'username': 'visemet', 'name': 'Max Hirschhorn', 'email': 'max.hirschhorn@mongodb.com'}

Message: SERVER-36230 Handle non-templatized LockerImpl class in gdb scripts.
Branch: master
https://github.com/mongodb/mongo/commit/684d7fce79006ba7d18ddf44e67a2f1aec6f8fbb

Comment by Max Hirschhorn [ 20/Jul/18 ]

The mongodb-show-locks command is similarly not showing any output. I also don't see any "Ignoring GDB error" messages, so I believe the find_lock_manager_holders() function simply isn't finding any lock manager holders. The changes from c7bd92f as part of SERVER-36011 made it so the LockerImpl class is no longer a template. CC geert.bosch

The find_lock_manager_holders() function had assumed that LockerImpl is a template, both in the string it searches for in the frame and in the concrete type it looks up.

def find_lock_manager_holders(graph, thread_dict, show):  # pylint: disable=too-many-locals
    """Find lock manager holders."""
    frame = find_frame(r'mongo::LockerImpl\<.*\>::')
    if not frame:
        return
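
To illustrate why the frame search above comes back empty, here is a minimal sketch in plain Python (no gdb required). It assumes a pattern whose template argument list is optional. The names LOCKER_FRAME_RE and is_locker_frame are hypothetical and are not taken from the hang analyzer scripts; the actual fix in the linked commit may differ.

```python
import re

# Pattern quoted in this ticket: it requires a template argument list.
OLD_LOCKER_FRAME_RE = re.compile(r'mongo::LockerImpl\<.*\>::')

# Hypothetical replacement: the "<...>" part is optional, so both the
# templatized and the non-templatized class names match.
LOCKER_FRAME_RE = re.compile(r'mongo::LockerImpl(<.*>)?::')


def is_locker_frame(function_name, pattern=LOCKER_FRAME_RE):
    """Return True if the frame's function name belongs to LockerImpl."""
    return pattern.search(function_name or "") is not None
```

With this sketch, an illustrative frame name such as "mongo::LockerImpl::lockComplete" is matched by LOCKER_FRAME_RE but not by OLD_LOCKER_FRAME_RE, which is consistent with find_lock_manager_holders() returning early without an "Ignoring GDB error" message. The concrete type lookup would need the same adjustment; that part is not shown here.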

Note: The issue isn't specific to Amazon Linux 2 so I'm changing the title of this ticket.

Generated at Thu Feb 08 04:42:28 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.