[SERVER-5152] Windows unhandled exception filter should report thread and fault address
Created: 01/Mar/12  Updated: 11/Jul/16  Resolved: 04/Mar/12

Status: Closed
Project: Core Server
Component/s: Internal Code, Logging
Affects Version/s: None
Fix Version/s: 2.1.1

Type: Improvement Priority: Major - P3
Reporter: Tad Marshall Assignee: Tad Marshall
Resolution: Done Votes: 0
Labels: Windows
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Windows

Backwards Compatibility: Fully Compatible
Participants:

 Description   

The Windows version of mongod installs an "unhandled exception filter", which is called when an exception occurs that none of our regular try/catch code traps. In practice it exists for, and almost always runs because of, "access violations", the Windows term for a segfault. Attempts to read from address 0, or from 0 plus a structure offset, pass through this filter on their way to a quick exit. All the filter does is record that the exception happened.

But the existing code garbles its output: it prints "unhandled Windows ex", followed by a timestamp and "access violation" with no newline, so the message looks really bad. Worse, it doesn't tell us which thread had the access violation or what the faulting address was, so we have nothing to go on when debugging it.

The code should instead write a single readable output line, make sure that line goes to the log file so we get it when mongod.exe is running as a service, and report which thread faulted and what the faulting address was. This would at least give us a starting point for finding out how a crash happened.
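As a rough illustration of the behavior requested above, here is a minimal sketch of a filter that writes one complete line containing the exception code, the address, and the thread. It is not the code from the eventual commit. The names formatFaultLine, reportingFilter and installFaultReporter are hypothetical, the thread is identified by its numeric ID rather than mongod's thread name, and the line goes to stderr instead of mongod's log. Only SetUnhandledExceptionFilter, GetCurrentThreadId and the EXCEPTION_POINTERS / EXCEPTION_RECORD structures are real Win32 API. The Windows-specific part is guarded so the formatting helper can be built and checked on any platform.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: builds one complete, newline-terminated line so the
// report cannot be interleaved with a timestamp or left without a newline.
std::string formatFaultLine(const std::string& thread,
                            std::uint32_t exceptionCode,
                            std::uintptr_t address) {
    char buf[256];
    std::snprintf(buf, sizeof(buf),
                  "unhandled exception 0x%08X at address 0x%016llX in thread %s\n",
                  static_cast<unsigned>(exceptionCode),
                  static_cast<unsigned long long>(address),
                  thread.c_str());
    return std::string(buf);
}

#ifdef _WIN32
// Hypothetical filter: Windows calls it on the faulting thread when no other
// handler takes the exception.
LONG WINAPI reportingFilter(EXCEPTION_POINTERS* info) {
    const EXCEPTION_RECORD* rec = info->ExceptionRecord;
    // ExceptionAddress is the address of the instruction that faulted.
    const std::string line = formatFaultLine(
        std::to_string(GetCurrentThreadId()),
        rec->ExceptionCode,
        reinterpret_cast<std::uintptr_t>(rec->ExceptionAddress));
    // Assumption: stderr stands in for the server log in this sketch.
    std::fputs(line.c_str(), stderr);
    std::fflush(stderr);
    // Do not try to keep running; let the process terminate.
    return EXCEPTION_EXECUTE_HANDLER;
}

// Hypothetical installer, to be called once early in startup.
void installFaultReporter() {
    SetUnhandledExceptionFilter(reportingFilter);
}
#endif
```

A real filter would route the line through the server's logging so that it reaches the log file when running as a service, and would probably avoid heap allocation inside the handler; std::string is used here only to keep the sketch short.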

I am posting this because of an access violation that happened in buildbot on the 32-bit Windows build. It was not reproducible on my machine, and the same test passed on the next buildbot run. So all we know is that the 32-bit Windows version can crash, but nothing about how.



 Comments   
Comment by auto [ 04/Mar/12 ]

Author: Tad Marshall (tadmarshall) <tad@10gen.com>

Message: SERVER-5152 Improve unhandled exception code on Windows

This change improves the reporting of unhandled exceptions in
mongod.exe in Windows, mainly access violations (segfaults). Like
the old code, it doesn't try to keep running, but in release
builds this version does try to use dbexit() to exit instead of
letting Windows do it to us. In debug builds, we pass the exception
on to Windows as before, which lets it offer the chance to run the
debugger. Tested with a null pointer reference, we now get a clean
log with a thread name and the address where the exception happened.
Branch: master
https://github.com/mongodb/mongo/commit/ad7deaa8818c95f2c590232a8a73e71441413e21

Comment by Tad Marshall [ 03/Mar/12 ]

I edited the description to remove my incorrect claim about catch ( ... ) and set the Fix Version to 2.1.1 since the code is written and moving through code review. This will be very helpful for debugging access violations in the Windows version in the field.

Comment by Tad Marshall [ 01/Mar/12 ]

Good question, and what I said may be wrong. I have used __try { } __finally { } and noticed that C++ exceptions on Windows seem to use SEH, but I don't have actual practice trying to use SEH and C++ exceptions together so I am probably wrong.

Update ... I tested it and I am wrong.

    void Listener::_logListen( int port , bool ssl ) {
        log() << _name << ( _name.size() ? " " : "" )
              << "waiting for connections on port " << port
              << ( ssl ? " ssl" : "" ) << endl;
        // Test only: force a single access violation on the first call.
        static bool uncrashed = true;
        try {
            if ( uncrashed ) {
                uncrashed = false;
                *( static_cast< char * >( 0 ) ) = 0; // write through a null pointer
            }
        }
        catch ( ... ) {
            // Never reached in this test: catch ( ... ) did not trap the access violation.
            log() << "caught exception, continuing ..." << endl;
        }
    }

Comment by Andy Schwerin [ 01/Mar/12 ]

Interesting, does (...) catch access violations if you don't have structured exception handling (SEH) enabled?
