[SERVER-58194] Fix undefined behavior in the destructor for OpMsgFuzzerFixture Created: 01/Jul/21  Updated: 29/Oct/23  Resolved: 27/Oct/21

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: 5.1.0-rc3

Type: Bug Priority: Major - P3
Reporter: Amirsaman Memaripour Assignee: Amirsaman Memaripour
Resolution: Fixed Votes: 0
Labels: servicearch-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Problem/Incident
causes SERVER-66464 use LLVMFuzzerRunDriver for OpMsgFuzz... Open
Related
related to SERVER-66455 Undo ~OpMsgFuzzerFixture fix Closed
is related to SERVER-66456 remove ThreadContext Closed
is related to SERVER-66385 Remove ThreadName's dependence on Thr... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

The test consistently fails on {A,UB}SAN Enterprise Ubuntu 18.04 FUZZER for any commit post SERVER-56351

Sprint: Service Arch 2021-11-01
Participants:
Linked BF Score: 50
Story Points: 4

 Description   

SERVER-56351 introduced a destructor for OpMsgFuzzerFixture that uses ClientStrand to bind the test client to the current thread when executed (here):

OpMsgFuzzerFixture::~OpMsgFuzzerFixture() {
    CollectionShardingStateFactory::clear(_serviceContext);
 
    {
        auto clientGuard = _clientStrand->bind();
        auto opCtx = _serviceContext->makeOperationContext(clientGuard.get());
        Lock::GlobalLock glk(opCtx.get(), MODE_X);
        auto databaseHolder = DatabaseHolder::get(opCtx.get());
        databaseHolder->closeAll(opCtx.get());
    }
 
    shutdownGlobalStorageEngineCleanly(_serviceContext);
}

Binding a client strand calls _setCurrent, which in turn calls ThreadName::set to set the current thread's name:

void ClientStrand::_setCurrent() noexcept {
    invariant(_isBound.load());
    invariant(_client);
 
    LOGV2_DEBUG(
        5127801, kDiagnosticLogLevel, "Setting the Client", "client"_attr = _client->desc());
 
    // Set the Client for this thread so calls to Client::getCurrent() works as expected.
    Client::setCurrent(std::move(_client));
 
    // Set up the thread name.
    _oldThreadName = ThreadName::set(ThreadContext::get(), _threadName);
    if (_oldThreadName) {
        LOGV2_DEBUG(5127802, kDiagnosticLogLevel, "Set thread name", "name"_attr = *_threadName);
    }
}

During termination of the test, however, the thread-local lookup behind ThreadContext::get() appears to return nullptr, and the sanitizers report:

[cpp_libfuzzer_test:op_msg_fuzzer] SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/mongo/util/decorable.h:84:28 in
[cpp_libfuzzer_test:op_msg_fuzzer] AddressSanitizer:DEADLYSIGNAL
[cpp_libfuzzer_test:op_msg_fuzzer] =================================================================
[cpp_libfuzzer_test:op_msg_fuzzer] ==12365==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000018 (pc 0x55aea54da19e bp 0x7ffca099be90 sp 0x7ffca099be70 T0)
[cpp_libfuzzer_test:op_msg_fuzzer] ==12365==The signal is caused by a READ memory access.
[cpp_libfuzzer_test:op_msg_fuzzer] ==12365==Hint: address points to the zero page.
[cpp_libfuzzer_test:op_msg_fuzzer]     #0 0x55aea54da19d in std::__uniq_ptr_impl<unsigned char, std::default_delete<unsigned char []> >::_M_ptr() const /opt/mongodbtoolchain/revisions/0d5a071f1db663c050a1d7f330c13f46e62d6d4f/stow/gcc-v3.ulc/lib/gcc/x86_64-mongodb-linux/8.5.0/../../../../include/c++/8.5.0/bits/unique_ptr.h:150:42
[cpp_libfuzzer_test:op_msg_fuzzer]     #1 0x55aeb36890a3 in std::unique_ptr<unsigned char [], std::default_delete<unsigned char []> >::get() const /opt/mongodbtoolchain/revisions/0d5a071f1db663c050a1d7f330c13f46e62d6d4f/stow/gcc-v3.ulc/lib/gcc/x86_64-mongodb-linux/8.5.0/../../../../include/c++/8.5.0/bits/unique_ptr.h:598:21
[cpp_libfuzzer_test:op_msg_fuzzer]     #2 0x55aeb36890a3 in mongo::DecorationContainer<mongo::ThreadContext>::getDecoration(mongo::DecorationContainer<mongo::ThreadContext>::DecorationDescriptor) /data/mci/cf4c5d480d66775786e7d4ad6010fa16/src/src/mongo/util/decoration_container.h:168
[cpp_libfuzzer_test:op_msg_fuzzer]     #3 0x55aeb367f3e8 in mongo::(anonymous namespace)::ThreadNameSconce& mongo::DecorationContainer<mongo::ThreadContext>::getDecoration<mongo::(anonymous namespace)::ThreadNameSconce>(mongo::DecorationContainer<mongo::ThreadContext>::DecorationDescriptorWithType<mongo::(anonymous namespace)::ThreadNameSconce>) /data/mci/cf4c5d480d66775786e7d4ad6010fa16/src/src/mongo/util/decoration_container.h:183:33
[cpp_libfuzzer_test:op_msg_fuzzer]     #4 0x55aeb367f3e8 in mongo::Decorable<mongo::ThreadContext>::Decoration<mongo::(anonymous namespace)::ThreadNameSconce>::operator()(mongo::ThreadContext&) const /data/mci/cf4c5d480d66775786e7d4ad6010fa16/src/src/mongo/util/decorable.h:80
[cpp_libfuzzer_test:op_msg_fuzzer]     #5 0x55aeb367f3e8 in mongo::Decorable<mongo::ThreadContext>::Decoration<mongo::(anonymous namespace)::ThreadNameSconce>::operator()(mongo::ThreadContext*) const /data/mci/cf4c5d480d66775786e7d4ad6010fa16/src/src/mongo/util/decorable.h:84
[cpp_libfuzzer_test:op_msg_fuzzer]     #6 0x55aeb367e2b8 in mongo::ThreadName::set(boost::intrusive_ptr<mongo::ThreadContext>, boost::intrusive_ptr<mongo::ThreadName>) /data/mci/cf4c5d480d66775786e7d4ad6010fa16/src/src/mongo/util/concurrency/thread_name.cpp:204:20
[cpp_libfuzzer_test:op_msg_fuzzer]     #7 0x55aeb2be4801 in mongo::ClientStrand::_setCurrent() /data/mci/cf4c5d480d66775786e7d4ad6010fa16/src/src/mongo/db/client_strand.cpp:70:22
[cpp_libfuzzer_test:op_msg_fuzzer]     #8 0x55aea5292ef8 in mongo::ClientStrand::bind() /data/mci/cf4c5d480d66775786e7d4ad6010fa16/src/src/mongo/db/client_strand.h:151:16
[cpp_libfuzzer_test:op_msg_fuzzer]     #9 0x55aea5292ef8 in mongo::OpMsgFuzzerFixture::~OpMsgFuzzerFixture() /data/mci/cf4c5d480d66775786e7d4ad6010fa16/src/src/mongo/db/op_msg_fuzzer_fixture.cpp:133
[cpp_libfuzzer_test:op_msg_fuzzer]     #10 0x7f90028bc160 in __run_exit_handlers /build/glibc-S9d2JN/glibc-2.27/stdlib/exit.c:108
[cpp_libfuzzer_test:op_msg_fuzzer]     #11 0x7f90028bc259 in exit /build/glibc-S9d2JN/glibc-2.27/stdlib/exit.c:139
[cpp_libfuzzer_test:op_msg_fuzzer]     #12 0x55aea5129b40 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /data/mci/dfb39050aa31abca2c40371c920c586e/toolchain-builder/tmp/build-llvm.sh-GjU/llvm/projects/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:768:5
[cpp_libfuzzer_test:op_msg_fuzzer]     #13 0x55aea5152462 in main /data/mci/dfb39050aa31abca2c40371c920c586e/toolchain-builder/tmp/build-llvm.sh-GjU/llvm/projects/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
[cpp_libfuzzer_test:op_msg_fuzzer]     #14 0x7f900289abf6 in __libc_start_main /build/glibc-S9d2JN/glibc-2.27/csu/../csu/libc-start.c:310
[cpp_libfuzzer_test:op_msg_fuzzer]     #15 0x55aea5106029 in _start (/data/mci/cf4c5d480d66775786e7d4ad6010fa16/src/build/install/bin/op_msg_fuzzer+0x32538029)

This ticket should investigate the root cause for this behavior and fix it.



 Comments   
Comment by Githook User [ 16/May/22 ]

Author: Billy Donahue (billy.donahue@mongodb.com, username: BillyDonahue)

Message: SERVER-66455 Revert SERVER-58194 `ThreadContext` present as cleaning up `OpMsgFuzzerFixture`
Branch: master
https://github.com/mongodb/mongo/commit/290133720f2c7c4ed7fd438d136809ba7935d530

Comment by Githook User [ 27/Oct/21 ]

Author: Amirsaman Memaripour (amirsaman.memaripour@mongodb.com, username: samanca)

Message: SERVER-58194 Have `ThreadContext` present as cleaning up `OpMsgFuzzerFixture`
Branch: master
https://github.com/mongodb/mongo/commit/4915e76b208b8923004fe2a99f05c663955cdb32

Comment by Amirsaman Memaripour [ 27/Oct/21 ]

This is an issue with the destruction order of objects with thread-local and static storage:

  • Each thread, including the main thread, has a thread-local instance of ThreadContext (see here).
  • The main thread for the fuzzer test creates a static instance of OpMsgFuzzerFixture, to ensure the fixture lasts throughout fuzzer tests (see here).

Based on the destruction order for thread-local and static storage (see below), the destructor for the thread-local (i.e., ThreadContext) is always invoked before the destructor for OpMsgFuzzerFixture, causing the undefined behavior reported by UBSAN.

Destructors for initialized objects with thread storage duration within a given thread are called as a result of returning from the initial function of that thread and as a result of that thread calling std::exit. The completions of the destructors for all initialized objects with thread storage duration within that thread strongly happen before the initiation of the destructors of any object with static storage duration.
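The first half of the quoted rule (thread-locals are destroyed when a thread's initial function returns) can be observed in isolation. The following is a minimal sketch, not code from the server: Tracer and traceThreadLocalLifetime are hypothetical names introduced for illustration, and the ordering against static destructors at std::exit is not exercised here.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <vector>

// Hypothetical tracer: records its own construction and destruction.
struct Tracer {
    explicit Tracer(std::vector<std::string>* events) : _events(events) {
        assert(_events != nullptr);
        _events->push_back("thread-local constructed");
    }
    ~Tracer() {
        _events->push_back("thread-local destroyed");
    }
    std::vector<std::string>* _events;
};

// Runs a worker thread whose initial function touches a thread_local object,
// then returns the recorded sequence of events.
inline std::vector<std::string> traceThreadLocalLifetime() {
    std::vector<std::string> events;
    std::thread worker([&events] {
        // Constructed on first use by this thread; destroyed when the thread exits.
        thread_local Tracer tracer(&events);
        (void)tracer;
        events.push_back("initial function returning");
    });
    // The worker's thread-local destructors complete before join() returns.
    worker.join();
    events.push_back("after join");
    return events;
}
```

In this sketch the "thread-local destroyed" event is recorded after the initial function returns and before join() completes. For the main thread, the analogous point is the call to std::exit, which is why a static object's destructor can no longer rely on the main thread's thread-locals.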

One way to address the lifetime issue is to have a separate thread run the code that requires a live ThreadContext, since a newly started thread gets its own thread-local instance.
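The separate-thread approach can be sketched roughly as follows. This is an illustration under stated assumptions rather than the committed fix: FakeThreadContext, currentContext, runOnHelperThread, and FakeFixture are hypothetical stand-ins, not the real mongo::ThreadContext, ClientStrand, or OpMsgFuzzerFixture.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for a per-thread context object.
struct FakeThreadContext {
    std::string threadName = "unnamed";
};

// Each thread lazily gets its own instance, destroyed when that thread exits.
inline FakeThreadContext* currentContext() {
    thread_local FakeThreadContext ctx;
    return &ctx;
}

// Runs `fn` on a short-lived helper thread and waits for it to finish.
template <typename F>
void runOnHelperThread(F&& fn) {
    std::thread helper(std::forward<F>(fn));
    helper.join();
}

// Hypothetical fixture whose cleanup needs a live per-thread context.
struct FakeFixture {
    explicit FakeFixture(std::string* cleanupLog) : _cleanupLog(cleanupLog) {}

    ~FakeFixture() {
        // Do not touch the destroying thread's thread-locals: if this object
        // has static storage duration, they may already be gone. The helper
        // thread has a freshly constructed context of its own.
        runOnHelperThread([log = _cleanupLog] {
            FakeThreadContext* ctx = currentContext();
            assert(ctx != nullptr);
            ctx->threadName = "fixture-cleanup";
            *log = ctx->threadName;
        });
    }

    std::string* _cleanupLog;
};
```

The destructor blocks until the helper thread finishes, so cleanup still completes before the fixture's storage is released. The trade-off is that the destructor now starts a thread, which is only appropriate when thread creation is still possible at that point in shutdown.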
