[SERVER-46008] logv2 shutdown use-after-dtor Created: 06/Feb/20  Updated: 29/Oct/23  Resolved: 27/Feb/20

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: 4.3.4

Type: Bug Priority: Major - P3
Reporter: Billy Donahue Assignee: Henrik Edin
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-45677 enable libunwind by default where ava... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

    $ gdb --args build/static_gcc_ninja/mongo/db/db_unittests --suite=DBRAIITestFixture --filter=AutoGetCollectionForReadGlobalLockDeadline --repeat=1

Sprint: Dev Tools 2020-03-09
Participants:

 Description   

 

 

 

2020-02-06T02:19:37.296-0500 I - [main] TOTALS | tests: 1 | fails: 0 | assert calls: 0 | time secs: 4.783
 2020-02-06T02:19:37.296-0500 I - [main] SUCCESS - All tests in all suites passed
 pure virtual method called
Thread 1 "db_unittests" received signal SIGSEGV, Segmentation fault.
 mongo::logv2::detail::doLogImpl (id=0, severity=..., options=..., message=..., attrs=...) at src/mongo/logv2/log_detail.cpp:51
 51 auto& source = options.domain().internal().source();
 (gdb) bt
 #0 mongo::logv2::detail::doLogImpl (id=0, severity=..., options=..., message=..., attrs=...) at src/mongo/logv2/log_detail.cpp:51
 #1 0x000055555b8983d6 in mongo::logv2::detail::doLog<char [3], mongo::StringData&> (id=0, severity=..., options=..., message=..., args#0=...)
 at src/mongo/logv2/log_detail.h:61
 #2 0x000055555b897f06 in mongo::logger::LogV2Appender<mongo::logger::MessageEventEphemeral>::append (this=0x7ffff065d9d0, event=...)
 at src/mongo/logger/logv2_appender.h:102
 #3 0x000055555b89d19c in mongo::logger::LogDomain<mongo::logger::MessageEventEphemeral>::append (this=0x7ffff7f96f48, event=...) at src/mongo/logger/log_domain-impl.h:56
 #4 0x000055555b8f3534 in mongo::(anonymous namespace)::writeMallocFreeStreamToLog () at src/mongo/util/signal_handlers_synchronous.cpp:174
 #5 0x000055555b8f3721 in mongo::(anonymous namespace)::myTerminate () at src/mongo/util/signal_handlers_synchronous.cpp:236
 #6 0x000055555bb7ad76 in _cxxabiv1::_terminate(void ()) () at ../../../../src/combined/libstdc+-v3/libsupc+/eh_terminate.cc:47
 #7 0x000055555bb7adb1 in std::terminate () at ../../../../src/combined/libstdc+-v3/libsupc+/eh_terminate.cc:57
 #8 0x000055555bb862df in _cxxabiv1::_cxa_pure_virtual () at ../../../../src/combined/libstdc+-v3/libsupc+/pure.cc:50
 #9 0x000055555b8dd8ae in mongo::Mutex::_onQuickLock (this=0x55555dd762e0 <mongo::globalRSMonitorManager>) at src/mongo/platform/mutex.cpp:102
 #10 0x000055555b8dd5c8 in mongo::Mutex::lock (this=0x55555dd762e0 <mongo::globalRSMonitorManager>) at src/mongo/platform/mutex.cpp:51
 #11 0x0000555558e58b56 in std::lock_guard<mongo::Latch>::lock_guard (this=0x7fffffffda20, __m=...)
 at /opt/mongodbtoolchain/stow/gcc-v3.sFn/include/c++/8.2.0/bits/std_mutex.h:162
 #12 0x000055555b0c9fa9 in mongo::ReplicaSetMonitorManager::shutdown (this=0x55555dd762e0 <mongo::globalRSMonitorManager>)
 at src/mongo/client/replica_set_monitor_manager.cpp:165
 #13 0x000055555b0c92cc in mongo::ReplicaSetMonitorManager::~ReplicaSetMonitorManager (this=0x55555dd762e0 <mongo::globalRSMonitorManager>, __in_chrg=<optimized out>)
 at src/mongo/client/replica_set_monitor_manager.cpp:69
 #14 0x00007ffff6030041 in _run_exit_handlers (status=0, listp=0x7ffff63d8718 <_exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true)
 at exit.c:108
 #15 0x00007ffff603013a in __GI_exit (status=<optimized out>) at exit.c:139
 #16 0x00007ffff600eb9e in __libc_start_main (main=0x555558fd0c02 <main(int, char*, char*)>, argc=4, argv=0x7fffffffdcb8, init=<optimized out>, fini=<optimized out>,
 rtld_fini=<optimized out>, stack_end=0x7fffffffdca8) at ../csu/libc-start.c:344
 #17 0x0000555558c488ea in _start () at ../../../../src/combined/libstdc+-v3/libsupc+/eh_terminate.cc:51
 (gdb)

 

 



 Comments   
Comment by Githook User [ 26/May/22 ]

Author:

{'name': 'Andrew Morrow', 'email': 'acm@mongodb.com', 'username': 'acmorrow'}

Message: SERVER-60440 Upgrade boost to 1.79.0

Comment by Githook User [ 13/May/21 ]

Author:

{'name': 'Andrew Morrow', 'email': 'acm@mongodb.com', 'username': 'acmorrow'}

Message: SERVER-44079 Reapply the fix for SERVER-46008 to boost 1.76.0

This reverts commit 6180e0a83a5a50534ebb09e1a075a774ba92c919.
Branch: master
https://github.com/mongodb/mongo/commit/68a54cbf96cb2ab7ff332552fd951701ea1cb95c

Comment by Githook User [ 07/May/21 ]

Author:

{'name': 'Andrew Morrow', 'email': 'acm@mongodb.com', 'username': 'acmorrow'}

Message: Revert "SERVER-44079 Reapply the fix for SERVER-46008 to boost 1.76.0"

This reverts commit b06cffeabf1d6152f3cf722efb07a144480d3917.
Branch: master
https://github.com/mongodb/mongo/commit/6180e0a83a5a50534ebb09e1a075a774ba92c919

Comment by Githook User [ 07/May/21 ]

Author:

{'name': 'Andrew Morrow', 'email': 'acm@mongodb.com', 'username': 'acmorrow'}

Message: SERVER-44079 Reapply the fix for SERVER-46008 to boost 1.76.0

The patch was generated with

`git format-patch -1 23a5160098996b401a1145504a7be356a3e6c8f1`

and then applied with

`patch p4 < 0001SERVER-46008-Remove-thread-specific-attributes-featu.patch`

Changes to the `SConstruct` were ignored as they did not need to be reapplied.
Branch: master
https://github.com/mongodb/mongo/commit/b06cffeabf1d6152f3cf722efb07a144480d3917

Comment by Githook User [ 26/Mar/20 ]

Author:

{'name': 'Henrik Edin', 'username': 'henrikedin', 'email': 'henrik.edin@mongodb.com'}

Message: SERVER-46008 Remove thread specific attributes feature from boost::log

Because we prefer to compile with BOOST_LOG_USE_COMPILER_TLS for performance reasons it is not safe to log during global shutdown because of the uses of thread_local.
We are not using this feature so we remove it instead of removing BOOST_LOG_USE_COMPILER_TLS and affecting the server.

The issue only affects unittests as we only do clean shutdown there.

(cherry picked from commit 23a5160098996b401a1145504a7be356a3e6c8f1)
Branch: v4.4
https://github.com/mongodb/mongo/commit/5514680bff53e8371b4db6a7a5cd73009530ab47

Comment by Githook User [ 06/Mar/20 ]

Author:

{'username': 'henrikedin', 'name': 'Henrik Edin', 'email': 'henrik.edin@mongodb.com'}

Message: SERVER-46008 Remove thread specific attributes feature from boost::log

Because we prefer to compile with BOOST_LOG_USE_COMPILER_TLS for performance reasons it is not safe to log during global shutdown because of the uses of thread_local.
We are not using this feature so we remove it instead of removing BOOST_LOG_USE_COMPILER_TLS and affecting the server.

The issue only affects unittests as we only do clean shutdown there.
Branch: master
https://github.com/mongodb/mongo/commit/23a5160098996b401a1145504a7be356a3e6c8f1

Comment by Githook User [ 26/Feb/20 ]

Author:

{'name': 'Henrik Edin', 'username': 'henrikedin', 'email': 'henrik.edin@mongodb.com'}

Message: SERVER-46008 Fix so logging works during global shutdown
Branch: master
https://github.com/mongodb/mongo/commit/95a345cc3f19a4762c68a40c7dd283ba0772ee11

Comment by Eric Milkie [ 22/Feb/20 ]

I am using a similar command line to build, except that I am using link-model dynamic.

Comment by Billy Donahue [ 22/Feb/20 ]

I just retried this today and the steps to reproduce no longer yield a failing db_unittest.

Maybe some destructors have been shuffled around or the underlying problem has been fixed.

I couldn't reproduce milkie's storage_wiredtiger_record_store_and_index_test failure either.

My SCons invocation, FTR:
/home/billy/dev/mongodb/mongo/venv3/bin/python3 buildscripts/scons.py VARIANT_DIR=static_gcc_ninja MONGO_VERSION=0.0.0 MONGO_GIT_HASH=unknown --dbg=on --opt=on ICECC=icecc VERBOSE=1 --modules= --ninja NINJA_PREFIX=static_gcc --variables-files=etc/scons/mongodbtoolchain_stable_gcc.vars --link-model=static generate-ninja

WIth master code @ rev 616564d1a5.

Comment by Eric Milkie [ 21/Feb/20 ]

I'm noticed this error today, just running storage_wiredtiger_record_store_and_index_test in debug mode. I suspect other tests also hit this.
The origin of the problem appears to be that the new Mutex stuff is hitting a pure virtual function call if you try to lock a Mutex during exit handling, possibly because some global state has already destructed.
In the handling of the pure virtual function call, the C++ runtime calls std::terminate. This tries to use logv2's boost::log to format and output a message, but it hits a segmentation fault, possibly due to similar global-destruction-order issues.

Comment by Sara Williamson [ 10/Feb/20 ]

acm we do have ASAN coverage for logv2.

Comment by Billy Donahue [ 06/Feb/20 ]

acm was observed by hand, while running a unit test under ordinary workstation build.

"pure virtual method called" is printed after the test succeeds, and the program self-terminates abnormally.

Comment by Andrew Morrow (Inactive) [ 06/Feb/20 ]

Did ASAN catch this, or do we not currently have ASAN coverage for logv2?

Generated at Thu Feb 08 05:10:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.