[SERVER-28824] Unhandled hang analyzer exception escapes loop to get threads from each process
Created: 17/Apr/17  Updated: 12/Oct/17  Resolved: 08/Aug/17

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 3.4.9, 3.5.12

Type: Bug
Priority: Major - P3
Reporter: Daniel Gottlieb (Inactive)
Assignee: Jonathan Abrahams
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v3.4
Sprint: TIG 2017-06-19, TIG 2017-08-21
Participants:
Linked BF Score: 0

 Description   

Happened on Solaris as part of an Evergreen task timeout:

[2017/04/14 18:04:33.677] terminate called after throwing an instance of 'gdb_exception_RETURN_MASK_ERROR'
[2017/04/14 18:04:34.794]   200  Thread 166
[2017/04/14 18:04:35.043] Bad exit code -6
[2017/04/14 18:04:35.043] Traceback (most recent call last):
[2017/04/14 18:04:35.043]   File "buildscripts/hang_analyzer.py", line 732, in <module>
[2017/04/14 18:04:35.044]     main()
[2017/04/14 18:04:35.044]   File "buildscripts/hang_analyzer.py", line 700, in main
[2017/04/14 18:04:35.044]     options.dump_core and check_dump_quota(max_dump_size_bytes, dbg.get_dump_ext()))
[2017/04/14 18:04:35.044]   File "buildscripts/hang_analyzer.py", line 381, in dump_info
[2017/04/14 18:04:35.044]     logger)
[2017/04/14 18:04:35.044]   File "buildscripts/hang_analyzer.py", line 56, in call
[2017/04/14 18:04:35.044]     raise Exception()
[2017/04/14 18:04:35.044] Exception
[2017/04/14 18:04:35.084] Command failed: exit status 1



 Comments   
Comment by Githook User [ 14/Aug/17 ]

Author:

{'name': 'Jonathan Abrahams', 'username': 'hptabster', 'email': 'jonathan@mongodb.com'}

Message: SERVER-28824 Trap debugger excptions in hang analyzer and display at end for all processes

(cherry picked from commit 324839c0c0c2b294a44d130105797ccdbb3b17a9)
Branch: v3.4
https://github.com/mongodb/mongo/commit/01f3279e039f610a99ef84238ca49498c716ee14

Comment by Githook User [ 08/Aug/17 ]

Author:

{'username': 'hptabster', 'email': 'jonathan@mongodb.com', 'name': 'Jonathan Abrahams'}

Message: SERVER-28824 Trap debugger excptions in hang analyzer and display at end for all processes
Branch: master
https://github.com/mongodb/mongo/commit/324839c0c0c2b294a44d130105797ccdbb3b17a9

Comment by Jonathan Abrahams [ 04/Aug/17 ]

It makes sense to trap the exceptions when invoking the debugger and hold them until the end.
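
A minimal sketch of that approach, for illustration only. It is not the code from the linked commits. The names `dump_all` and `dump_one` are hypothetical; `dump_one` stands in for whatever invokes the debugger (gdb, jstack, etc.) against a single process and raises on a bad exit code.

```python
def dump_all(pids, dump_one):
    """Call dump_one(pid) for every pid, trapping failures instead of stopping.

    Returns a list of (pid, exception) pairs for the processes whose dump
    failed, so the caller can report them after the loop has finished.
    """
    trapped = []
    for pid in pids:
        try:
            dump_one(pid)
        except Exception as err:  # hold the error; keep going with the next pid
            trapped.append((pid, err))
    return trapped


def demo_dump(pid):
    """Hypothetical stand-in for a debugger invocation; fails for pid 200."""
    if pid == 200:
        raise RuntimeError("Bad exit code -6")


# All three processes are visited even though the second one fails.
failures = dump_all([100, 200, 300], demo_dump)
for pid, err in failures:
    print("Error encountered while dumping process %d: %s" % (pid, err))
```

With this shape, the script can still exit non-zero when `failures` is non-empty, but only after every process has been examined.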

Comment by Max Hirschhorn [ 13/Jul/17 ]

jonathan.abrahams, I think we should either ignore a non-zero return code from the debugger or defer raising an exception until the debugger has finished attaching to all processes. The current behavior prevents jstack from running against both Java VM processes when a Jepsen task times out.

Comment by Jonathan Abrahams [ 13/Jun/17 ]

This was fixed by the upgrade to GDB 7.12.1.

Comment by Mark Benvenuto [ 27/Apr/17 ]

GDB 7.12.1 contains the fix.

From https://sourceware.org/gdb/download/ANNOUNCEMENT:

GDB 7.12.1 brings the following fixes and enhancements over GDB 7.12:
 
   * PR tdep/20682 (aarch64 regression: gdb.cp/nextoverthrow.exp)
   * PR server/20733 (Failed to build aarch64_be-linux-gnu GDBserver)
   * PR tdep/20953 (GDB crashes after "set architecture rl78")
   * PR tdep/20954 (GDB crashes if "set architecture rx")
   * PR tdep/20955 (GDB internal error in cris-tdep.c)
   * PR build/20712 (gdb 7.12+ doesn't build as C++ on Solaris)
   * PR breakpoint/20653 (string_to_explicit_location has some weird code)
   * PR build/20753 (MinGW compilation errors due to strcasecmp)
   * PR gdb/20977 (GDB exception handling is broken on i686-w64-mingw32)
   * PR python/21048 (backtrace is broken on i686)
   * PR sim/20808 (mips sim build fails due to undefined SD/CPU variables)
   * PR sim/20809 (mips sim build fails for r3900 cpus)
   * PR gdb/20939 (GDB aborts if there is an error in disassembly)

Comment by Max Hirschhorn [ 27/Apr/17 ]

mark.benvenuto, is there a way to tell if the version of GDB 7.12 in the MongoDB toolchain contains the fix from https://sourceware.org/bugzilla/show_bug.cgi?id=20939?
