[SERVER-11502] misplaced openssl callback registration can cause crashes Created: 31/Oct/13  Updated: 11/Jul/16  Resolved: 31/Oct/13

Status: Closed
Project: Core Server
Component/s: Networking
Affects Version/s: 2.4.6, 2.4.7, 2.5.3
Fix Version/s: 2.4.9, 2.5.4

Type: Bug Priority: Critical - P2
Reporter: Bruce Lucas (Inactive) Assignee: Andreas Nilsson
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Backwards Compatibility: Fully Compatible
Operating System: Linux
Participants:

 Description   
Issue Status as of December 30th, 2013

ISSUE SUMMARY
Users can see a rare, intermittent server crash due to a race condition in the OpenSSL interface. This will only impact users that are running with SSL enabled. The crash can manifest in several ways, but the most common signature is to see a segmentation fault (signal 11) or an abort (signal 6) reported in the mongod logs, with a backtrace that includes references to libcrypto.

USER IMPACT
With OpenSSL versions earlier than 1.x, the server exhibits random, intermittent crashes in the OpenSSL interface.

SOLUTION
The crashes were caused by repeated registration and unregistration of an OpenSSL callback function. The callback is now registered only once and is never unregistered.

WORKAROUNDS
Upgrade to OpenSSL 1.x.

PATCHES
Production release v2.4.9 contains the fix for this issue, and production release v2.6.0 will contain the fix as well.

Original Description

The calls that register the OpenSSL callbacks in SSLThreadInfo() and unregister them in ~SSLThreadInfo() are misplaced: the callback is a static global, whereas SSLThreadInfo objects are per-thread. The callbacks should be registered once, early (before any possible SSL activity), and never need to be unregistered (or at least should not be unregistered on every thread exit, which has been shown to cause crashes due to duplicate frees). Removing the unregister call from the destructor addresses the second point and solves the immediate problem, but there may be latent issues because the callback is not registered until the first SSLThreadInfo object is constructed, so the callback registration should probably be moved somewhere else as well.
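A minimal sketch of the arrangement described above, for illustration only and not the code from the actual commit. It assumes a hypothetical setLockingCallback() and a global slot as stand-ins for the OpenSSL registration call, and a hypothetical registerSSLCallbacksOnce() that startup code would call before any SSL activity. The per-thread SSLThreadInfo object never touches the global slot, so a thread exiting cannot remove the callback from under other threads.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for OpenSSL's single, process-wide callback slot.
using LockingCallback = void (*)(int mode, int lockIndex);
std::atomic<LockingCallback> gLockingCallback{nullptr};
std::atomic<int> gRegistrationCount{0};

// Hypothetical stand-in for the OpenSSL registration call.
void setLockingCallback(LockingCallback cb) {
    gLockingCallback.store(cb);
    gRegistrationCount.fetch_add(1);
}

// Placeholder body; a real callback would lock or unlock mutex number lockIndex.
void lockingCallback(int /*mode*/, int /*lockIndex*/) {}

// Registers the callback exactly once per process, however often it is called.
// Initialization of a function-local static is thread-safe since C++11.
void registerSSLCallbacksOnce() {
    static const bool done = [] {
        setLockingCallback(&lockingCallback);
        return true;
    }();
    (void)done;
}

// Per-thread state. Neither the constructor nor the destructor touches the
// global callback slot, so thread exit never unregisters anything.
class SSLThreadInfo {
public:
    SSLThreadInfo() = default;
    ~SSLThreadInfo() = default;
    bool callbackInstalled() const { return gLockingCallback.load() != nullptr; }
};

// Starts `threads` short-lived workers, each with its own SSLThreadInfo, and
// returns how many of them found the global callback slot empty.
int countThreadsWithoutCallback(int threads) {
    std::atomic<int> missing{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        workers.emplace_back([&missing] {
            SSLThreadInfo info;
            if (!info.callbackInstalled()) missing.fetch_add(1);
        });
    }
    for (auto& w : workers) w.join();
    return missing.load();
}
```

In this sketch, calling registerSSLCallbacksOnce() at startup and then running any number of worker threads leaves the registration count at one and the callback installed for every thread. The original pattern, by contrast, tied a process-wide resource to per-thread object lifetimes.
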



 Comments   
Comment by auto [ 06/Nov/13 ]

Author: Andreas Nilsson (agralius) <andreas.nilsson@10gen.com>

Message: SERVER-11502 Moved OpenSSL multithreading callbacks
Branch: v2.4
https://github.com/mongodb/mongo/commit/5779b6e198c0dd22a99e12837faea4b5e8b2664f

Comment by Bruce Lucas (Inactive) [ 31/Oct/13 ]

Looks good - test has run for 20 min and ~500k connections without mishap, vs seconds and a few thousand connections to hit the issue before the fix.

Comment by Andreas Nilsson [ 31/Oct/13 ]

bruce.lucas@10gen.com, can you verify that this is fixed on master now, before we backport?

Comment by auto [ 31/Oct/13 ]

Author: Andreas Nilsson (agralius) <andreas.nilsson@10gen.com>

Message: SERVER-11502 Moved OpenSSL multithreading callbacks
Branch: master
https://github.com/mongodb/mongo/commit/eab2644c221206c121ac1ab93fcf95c8100f4ff3
