[SERVER-12259] startup option to provide core dumps
Created: 06/Jan/14
Updated: 25/Jan/17
Resolved: 07/Jul/15

Status: Closed
Project: Core Server
Component/s: Admin
Affects Version/s: None
Fix Version/s: 3.1.6

Type: New Feature
Priority: Major - P3
Reporter: Bruce Lucas (Inactive)
Assignee: Andy Schwerin
Resolution: Done
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-18474 provide core dump on fassert and inva... Closed
related to SERVER-19699 Save diagnostic files on failure - Wi... Closed
is related to SERVER-19230 WT seg fault on pure read work load Closed
Backwards Compatibility: Minor Change
Sprint: Sharding 6 07/17/15
Participants:

 Description   

On occasion it is necessary to get a core dump from a customer, for example to debug a difficult-to-reproduce segfault. The only option currently is to ask the customer to recompile mongod with the sigaction call(s) in db.cpp commented out, which may not be an appealing option for many customers. It would be a useful serviceability feature to have a startup option that disables those sigaction calls (or otherwise enables core dumps) for at least SIGSEGV, SIGBUS, SIGILL, and SIGFPE, and possibly SIGABRT.



 Comments   
Comment by Charlie Page [ 07/Jul/15 ]

3.0 backport would be ideal.

Comment by Andy Schwerin [ 07/Jul/15 ]

Should we backport this to 3.0?

Comment by Githook User [ 07/Jul/15 ]

Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

Message: SERVER-12259 Make custom signal handlers defer to default behavior rather than calling quickExit.

On systems with good support for POSIX signals, by configuring the MongoDB
custom signal handlers to automatically reset to the default handler when
executing, and to re-raise fatal signals at the end of the custom handler
instead of calling quickExit, we can get proper OS behavior for core dumping and
exiting with signal codes.
Branch: master
https://github.com/mongodb/mongo/commit/69294c07d62d61f631f974ee852a41ad7087d19b
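
The pattern described in the commit message (log, then let the default action run) can be illustrated with a minimal Python sketch. This is not the code from the commit, which is C++ inside mongod. The names `log_then_defer` and `run_child` are hypothetical, and because Python's signal module does not expose the SA_RESETHAND flag, the handler here restores the default disposition explicitly before re-raising. It assumes a POSIX system and Python 3.8 or later.

```python
import os
import resource
import signal
import sys
import time


def log_then_defer(signum, frame):
    """Custom handler: log diagnostics, then hand the signal back to the OS."""
    sys.stderr.write(f"got signal {signum}; deferring to the default action\n")
    sys.stderr.flush()
    # No SA_RESETHAND in Python, so restore the default disposition by hand.
    signal.signal(signum, signal.SIG_DFL)
    # Re-raise instead of calling exit, so the OS applies its normal behavior
    # (termination by signal, and a core dump where the system allows one).
    signal.raise_signal(signum)


def run_child(signum):
    """Fork a child that raises `signum` under the handler; return its wait status."""
    pid = os.fork()
    if pid == 0:
        try:
            # Ask the kernel not to write a core file for this demo.
            _, hard = resource.getrlimit(resource.RLIMIT_CORE)
            resource.setrlimit(resource.RLIMIT_CORE, (0, hard))
            signal.signal(signum, log_then_defer)
            signal.raise_signal(signum)
            for _ in range(10):  # give the interpreter time to run the handler
                time.sleep(0.1)
        finally:
            os._exit(2)  # reached only if the signal did not terminate the child
    _, status = os.waitpid(pid, 0)
    return status
```

Calling `run_child(signal.SIGABRT)` should yield a wait status that reports death by SIGABRT rather than a plain exit code, which is the property the commit message refers to as "exiting with signal codes".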

Comment by Bruce Lucas (Inactive) [ 07/Jul/15 ]

In my opinion simply deferring to OS settings regarding core files is the best thing to do. Not just ulimit but also kernel.core_pattern, abrtd or equivalent, etc. play a role.
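
As a rough illustration of the kernel.core_pattern setting mentioned above, the sketch below classifies a pattern the way core(5) describes it: a leading `|` means the kernel pipes the dump to a program, anything else is a file name template. The function names are hypothetical, and the `/proc/sys/kernel/core_pattern` path assumes Linux.

```python
from pathlib import Path

# Linux-specific location of the setting (assumption: procfs is mounted).
CORE_PATTERN_PATH = Path("/proc/sys/kernel/core_pattern")


def classify_core_pattern(pattern):
    """Return ("pipe", command) or ("file", template) for a core_pattern value."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # The rest of the line is a program that receives the dump on stdin.
        return ("pipe", pattern[1:].strip())
    return ("file", pattern)


def current_core_pattern():
    """Classify this machine's core_pattern, or return None if it is unreadable."""
    try:
        return classify_core_pattern(CORE_PATTERN_PATH.read_text())
    except OSError:
        return None
```

A "pipe" result means a crash-collection daemon, not the working directory, is where a core from mongod would end up.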

Comment by Charlie Page [ 06/Jul/15 ]

It should be on by default and delegate to the OS, consistent with how we handle other settings. Then we can update the production notes to cover dumping core.

Comment by Andy Schwerin [ 06/Jul/15 ]

For reference, here's the core(5) man page.

Comment by Andy Schwerin [ 06/Jul/15 ]

I'm working on a patch that causes our signal handlers to delegate to the system default handler for each signal after printing the extra information we want to the logs. It's pretty straightforward, and has the added advantage that when the mongo process dies with a signal, the parent process can learn which signal via the WIFSIGNALED/WTERMSIG macros on the result of waitpid(). My patch causes the signals that normally produce core dumps to produce core dumps, when ulimit etc. allow. This includes anything that calls std::terminate, such as uncaught exceptions.
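
For readers unfamiliar with the waitpid() macros, here is a small Python sketch of how a parent process might decode a child's wait status. The helper names are hypothetical; `os.WIFSIGNALED`, `os.WTERMSIG`, and `os.WCOREDUMP` are the standard-library counterparts of the C macros. It assumes a POSIX system.

```python
import os
import resource
import signal


def describe_wait_status(status):
    """Turn a raw waitpid() status into a short human-readable string."""
    if os.WIFSIGNALED(status):
        name = signal.Signals(os.WTERMSIG(status)).name
        core = " (core dumped)" if os.WCOREDUMP(status) else ""
        return f"killed by {name}{core}"
    if os.WIFEXITED(status):
        return f"exited with code {os.WEXITSTATUS(status)}"
    return f"unrecognized wait status {status}"


def spawn_and_wait(action):
    """Fork, run action() in the child, and return the child's wait status."""
    pid = os.fork()
    if pid == 0:
        try:
            action()
        finally:
            os._exit(0)  # reached only if action() returns or raises
    _, status = os.waitpid(pid, 0)
    return status


def die_by_sigsegv():
    """Child body: terminate through the default SIGSEGV action."""
    # Ask the kernel not to write a core file for this demo.
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (0, hard))
    signal.signal(signal.SIGSEGV, signal.SIG_DFL)
    os.kill(os.getpid(), signal.SIGSEGV)
```

A supervisor could log `describe_wait_status(spawn_and_wait(die_by_sigsegv))` to distinguish a crash from an ordinary exit. A process that calls an exit function from its handler would instead show up only as "exited with code N".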

My question is, do we want this behavior on by default? The user would still have to enable core dumps by appropriately configuring ulimit. If you're curious, MMAPv1 data mappings won't be dumped unless the user changes the /proc/PID/coredump_filter bitmask.
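
The two settings mentioned here can be inspected with a short Python sketch. It checks the core size limit that `ulimit -c` controls and decodes a coredump_filter value using the bit meanings listed in core(5). The function names are hypothetical, the `/proc` path assumes Linux, and the thread does not say which bits a given MMAPv1 deployment would need, so the sketch only reports what a mask includes.

```python
import resource

# Bit meanings as documented in core(5).
COREDUMP_FILTER_BITS = {
    0: "anonymous private mappings",
    1: "anonymous shared mappings",
    2: "file-backed private mappings",
    3: "file-backed shared mappings",
    4: "ELF headers",
    5: "private huge pages",
    6: "shared huge pages",
}


def core_size_limit_allows_dump():
    """True if the soft RLIMIT_CORE (what `ulimit -c` shows) is non-zero."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft == resource.RLIM_INFINITY or soft > 0


def decode_coredump_filter(text):
    """List the mapping types enabled by a hexadecimal coredump_filter value."""
    mask = int(text.strip(), 16)
    return [name for bit, name in COREDUMP_FILTER_BITS.items() if mask & (1 << bit)]


def read_coredump_filter(pid="self"):
    """Decode /proc/<pid>/coredump_filter for a process (Linux only)."""
    with open(f"/proc/{pid}/coredump_filter") as f:
        return decode_coredump_filter(f.read())
```

For example, `decode_coredump_filter("00000033")` lists only anonymous mappings, ELF headers, and private huge pages, so file-backed mappings are left out of a dump taken with that mask.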

Comment by Michael Cahill (Inactive) [ 06/Jul/15 ]

martin.bligh, dan@10gen.com this came up again in SERVER-19230 – we're going to have to close that because we can't reproduce it and can't get a core dump to dig deeper. I think all we're talking about here is a (possibly undocumented) way to avoid calling setupSignalHandlers.

Can we get this scheduled?

Comment by Eric Milkie [ 08/Jan/14 ]

This should include equivalent functionality for minidumps on Windows.
