[SERVER-46682] Reuse debugger process for processes of same type in hang_analyzer.py Created: 06/Mar/20  Updated: 29/Oct/23  Resolved: 02/Jun/20

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 4.7.0

Type: Improvement
Priority: Major - P3
Reporter: Vlad Rachev (Inactive)
Assignee: Vlad Rachev (Inactive)
Resolution: Fixed
Votes: 0
Labels: tig-hanganalyzer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Gantt Dependency
has to be done after SERVER-46693 Parallelize debugger processes in han... Closed
Related
is related to SERVER-48479 hang-analyzer for macos doesn't work ... Closed
is related to SERVER-48590 QOL improvements for hang-analyzer Closed
Backwards Compatibility: Fully Compatible
Sprint: STM 2020-05-18, STM 2020-06-01, STM 2020-06-15
Participants:
Story Points: 3

 Description   

Reloading symbols for every process is another bottleneck. To alleviate this, hang_analyzer.py will be modified to reuse a single debugger process to analyze all processes of the same type (e.g. all mongod processes will be analyzed in the same debugger process).

  • Processes will be grouped by process type (e.g. all mongod processes)
  • A single debugger process will be created per process type that will:

    run debugger
    load symbols
    for process in processes:
        attach process
        dump info
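
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the actual commit: the helper names `group_by_type` and `build_gdb_argv` are hypothetical, and the GDB commands (`file`, `attach`, `thread apply all bt`, `detach`) merely stand in for whatever the real hang-analyzer script dumps. The point is that symbols are loaded once per process type, and every pid of that type is then handled in the same GDB session.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_type(processes: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group (process_name, pid) pairs into {process_name: [pid, ...]}."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for name, pid in processes:
        groups[name].append(pid)
    return dict(groups)


def build_gdb_argv(binary_path: str, pids: List[int]) -> List[str]:
    """Build one gdb command line that analyzes every pid in a single session.

    Hypothetical helper: symbols are loaded once via `file`, then each
    process is attached, dumped, and detached in turn.
    """
    argv = ["gdb", "--quiet", "--nx", "-ex", f"file {binary_path}"]
    for pid in pids:
        argv += ["-ex", f"attach {pid}"]
        argv += ["-ex", "thread apply all bt"]  # placeholder for "dump info"
        argv += ["-ex", "detach"]
    argv += ["-ex", "quit"]
    return argv
```

A caller would then start one debugger per group, for example by passing `build_gdb_argv(path, pids)` to `subprocess.run` for each entry returned by `group_by_type`.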

The debugger scripts are all hardcoded strings, and the script for GDB is especially ugly. GDB has a Python API, so if this change turns out to be non-trivial to express as hardcoded plaintext, we can consider rewriting the script to use the Python API.
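
For illustration, a Python-API version of the attach/dump loop might look something like the sketch below. It relies only on `gdb.execute`, which is part of GDB's Python API; the function name `dump_processes` and the choice of `thread apply all bt` as the dumped information are assumptions. The `gdb` module is passed in as a parameter because it can only be imported inside a running GDB.

```python
def dump_processes(gdb_module, pids):
    """Attach to each pid in turn and collect its backtraces as text.

    `gdb_module` is expected to be GDB's built-in `gdb` module (or an
    object with a compatible `execute` method).
    """
    reports = {}
    for pid in pids:
        gdb_module.execute(f"attach {pid}")
        # to_string=True makes gdb.execute return the output instead of printing it.
        reports[pid] = gdb_module.execute("thread apply all bt", to_string=True)
        gdb_module.execute("detach")
    return reports
```

Inside GDB this could be invoked from a sourced script with `import gdb` followed by `dump_processes(gdb, [pid1, pid2])`.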

As part of this ticket, verify that performance actually improves.



 Comments   
Comment by Vlad Rachev (Inactive) [ 04/Jun/20 ]

commit: https://github.com/mongodb/mongo/commit/9fcca8acb9a8995e007b5c4c06e5349a57e274e6

Comment by Vlad Rachev (Inactive) [ 10/Mar/20 ]

Move the gdb/lldb commands out of hang_analyzer.py and into their own Python functions, so that engineers can load those functions into gdb/lldb without needing to go through the hang-analyzer.

As part of this we will add some testing to hang_analyzer.py. One thing we can do is check that when we call dbg.dump_info with some pids, the command sent to the debugger includes those pids.
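
A test along those lines could be shaped like the sketch below. Everything here is hypothetical: the `GdbDumper` class, its injected `run_command` callable, and the GDB commands it emits are stand-ins, since the real `dbg.dump_info` signature is not shown in this ticket. The idea is only that the command runner is injectable, so a test can capture the command instead of launching a debugger.

```python
from typing import Callable, List


class GdbDumper:
    """Hypothetical stand-in for the hang-analyzer's debugger wrapper."""

    def __init__(self, run_command: Callable[[List[str]], None]):
        # The runner is injected so tests can record the command
        # rather than spawning a real debugger.
        self._run_command = run_command

    def dump_info(self, pids: List[int]) -> None:
        argv = ["gdb", "--quiet", "--nx"]
        for pid in pids:
            argv += ["-ex", f"attach {pid}", "-ex", "thread apply all bt", "-ex", "detach"]
        argv += ["-ex", "quit"]
        self._run_command(argv)


def check_pids_in_command(pids: List[int]) -> bool:
    """Return True if one command was sent and it attaches to every pid."""
    sent: List[List[str]] = []
    dbg = GdbDumper(run_command=sent.append)
    dbg.dump_info(pids)
    return len(sent) == 1 and all(f"attach {pid}" in sent[0] for pid in pids)
```

In a real test suite the same check would typically live in a `unittest.TestCase` method, with `subprocess` patched out instead of an injected callable.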

Generated at Thu Feb 08 05:12:09 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.