[SERVER-44595] Linux shutdown of mongod sometimes never completes Created: 13/Nov/19  Updated: 08/Jan/24  Resolved: 12/Dec/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.2.1
Fix Version/s: 4.2.3, 4.3.3

Type: Bug
Priority: Minor - P4
Reporter: Markus S
Assignee: Benjamin Caimano (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0, v3.6
Steps To Reproduce:

It happens intermittently and cannot be reproduced on demand.

Sprint: Service Arch 2019-11-18, Service Arch 2019-12-02, Service Arch 2019-12-16
Participants:

 Description   

After upgrading to v4.2.1 we started to get mongod processes hanging after receiving the shutdown command. 

One caveat: we are running Fedora 30 and using the prebuilt "rhel80" binaries.

Log:

2019-11-12T23:25:53.436+0100 I  NETWORK  [listener] connection accepted from 127.0.0.1:37382 #3 (2 connections now open)
2019-11-12T23:25:53.436+0100 I  NETWORK  [conn3] received client metadata from 127.0.0.1:37382 conn3: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.2.1" }, os: { type: "Linux", name: "Fedora release 30 (Thirty)", architecture: "x86_64", version: "Kernel 5.3.8-200.fc30.x86_64" } }
2019-11-12T23:25:53.440+0100 I  COMMAND  [conn3] terminating, shutdown command received { shutdown: 1.0, lsid: { id: UUID("788c9cf7-ef90-44af-a3cb-bb28024fa869") }, $db: "admin" }
2019-11-12T23:25:53.440+0100 I  NETWORK  [conn3] shutdown: going to close listening sockets...
2019-11-12T23:25:53.440+0100 I  NETWORK  [conn3] removing socket file: /tmp/mongodb-23638.sock

  
Backtrace of mongod at this state:

(gdb) thread apply all backtrace
Thread 30 (Thread 0x7fef4b911700 (LWP 63259)):
#0  0x00007fef5b9c19f8 in __pthread_timedjoin_ex () from /lib64/libpthread.so.0
#1  0x00005569d17a4ae3 in std::thread::join() ()
#2  0x00005569d0dd8331 in mongo::transport::TransportLayerASIO::shutdown() ()
#3  0x00005569d0dcc839 in mongo::transport::TransportLayerManager::shutdown() ()
#4  0x00005569cfc28cf8 in mongo::(anonymous namespace)::shutdownTask(mongo::ShutdownTaskArgs const&) ()
#5  0x00005569d167a675 in mongo::(anonymous namespace)::runTasks(std::stack<mongo::unique_function<void (mongo::ShutdownTaskArgs const&)>, std::deque<mongo::unique_function<void (mongo::ShutdownTaskArgs const&)>, std::allocator<mongo::unique_function<void (mongo::ShutdownTaskArgs const&)> > > >, mongo::ShutdownTaskArgs const&) ()
#6  0x00005569cfba8cf7 in mongo::shutdown(mongo::ExitCode, mongo::ShutdownTaskArgs const&) ()
#7  0x00005569d030dbaa in mongo::CmdShutdown::shutdownHelper(mongo::BSONObj const&) ()
#8  0x00005569d00f2c1e in mongo::(anonymous namespace)::CmdShutdownMongoD::run(mongo::OperationContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, mongo::BSONObj const&, mongo::BSONObjBuilder&) ()
#9  0x00005569d108ed24 in mongo::BasicCommand::Invocation::run(mongo::OperationContext*, mongo::rpc::ReplyBuilderInterface*) ()
#10 0x00005569cfff2548 in mongo::(anonymous namespace)::runCommandImpl(mongo::OperationContext*, mongo::CommandInvocation*, mongo::OpMsgRequest const&, mongo::rpc::ReplyBuilderInterface*, mongo::LogicalTime, mongo::ServiceEntryPointCommon::Hooks const&, mongo::BSONObjBuilder*, mongo::OperationSessionInfoFromClient const&) ()
#11 0x00005569cfff4a34 in mongo::(anonymous namespace)::receivedCommands(mongo::OperationContext*, mongo::Message const&, mongo::ServiceEntryPointCommon::Hooks const&)::{lambda()#1}::operator()() const ()
#12 0x00005569cfff579a in mongo::ServiceEntryPointCommon::handleRequest(mongo::OperationContext*, mongo::Message const&, mongo::ServiceEntryPointCommon::Hooks const&) ()
#13 0x00005569cffe3aac in mongo::ServiceEntryPointMongod::handleRequest(mongo::OperationContext*, mongo::Message const&) ()
#14 0x00005569cffef94c in mongo::ServiceStateMachine::_processMessage(mongo::ServiceStateMachine::ThreadGuard) ()
#15 0x00005569cffeb16f in mongo::ServiceStateMachine::_runNextInGuard(mongo::ServiceStateMachine::ThreadGuard) ()
#16 0x00005569cffee54c in std::_Function_handler<void (), mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName, mongo::ServiceStateMachine::Ownership)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#17 0x00005569d0dd6aa2 in mongo::transport::ServiceExecutorSynchronous::schedule(std::function<void ()>, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName) ()
#18 0x00005569cffe8acd in mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName, mongo::ServiceStateMachine::Ownership) ()
#19 0x00005569cffebe23 in mongo::ServiceStateMachine::_sourceCallback(mongo::Status) ()
#20 0x00005569cffea197 in mongo::ServiceStateMachine::_sourceMessage(mongo::ServiceStateMachine::ThreadGuard) ()
#21 0x00005569cffeb0cb in mongo::ServiceStateMachine::_runNextInGuard(mongo::ServiceStateMachine::ThreadGuard) ()
#22 0x00005569cffee54c in std::_Function_handler<void (), mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName, mongo::ServiceStateMachine::Ownership)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#23 0x00005569d0dd6f0b in std::_Function_handler<void (), mongo::transport::ServiceExecutorSynchronous::schedule(std::function<void ()>, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#24 0x00005569d1411874 in mongo::(anonymous namespace)::runFunc(void*) ()
#25 0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#26 0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 29 (Thread 0x7fef4ba12700 (LWP 62477)):
#0  0x00007fef5b9ca8bd in recvmsg () from /lib64/libpthread.so.0
#1  0x00005569d10705b0 in asio::detail::socket_ops::recv(int, iovec*, unsigned long, int, std::error_code&) ()
#2  0x00005569d1070668 in asio::detail::socket_ops::sync_recv(int, unsigned char, iovec*, unsigned long, int, bool, std::error_code&) ()
#3  0x00005569d0df8f5b in mongo::Future<void> mongo::transport::TransportLayerASIO::ASIOSession::opportunisticRead<asio::basic_stream_socket<asio::generic::stream_protocol>, asio::mutable_buffers_1>(asio::basic_stream_socket<asio::generic::stream_protocol>&, asio::mutable_buffers_1 const&, std::shared_ptr<mongo::Baton> const&) ()
#4  0x00005569d0e03eba in mongo::Future<void> mongo::transport::TransportLayerASIO::ASIOSession::read<asio::mutable_buffers_1>(asio::mutable_buffers_1 const&, std::shared_ptr<mongo::Baton> const&) ()
#5  0x00005569d0e075c7 in mongo::transport::TransportLayerASIO::ASIOSession::sourceMessageImpl(std::shared_ptr<mongo::Baton> const&) ()
#6  0x00005569d0e07a88 in mongo::transport::TransportLayerASIO::ASIOSession::sourceMessage() ()
#7  0x00005569cffea337 in mongo::ServiceStateMachine::_sourceMessage(mongo::ServiceStateMachine::ThreadGuard) ()
#8  0x00005569cffeb0cb in mongo::ServiceStateMachine::_runNextInGuard(mongo::ServiceStateMachine::ThreadGuard) ()
#9  0x00005569cffee54c in std::_Function_handler<void (), mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName, mongo::ServiceStateMachine::Ownership)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#10 0x00005569d0dd6f0b in std::_Function_handler<void (), mongo::transport::ServiceExecutorSynchronous::schedule(std::function<void ()>, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#11 0x00005569d1411874 in mongo::(anonymous namespace)::runFunc(void*) ()
#12 0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#13 0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 28 (Thread 0x7fef4c213700 (LWP 62168)):
#0  0x00007fef5b8ee88e in epoll_wait () from /lib64/libc.so.6
#1  0x00005569d106a29e in asio::detail::epoll_reactor::run(long, asio::detail::op_queue<asio::detail::scheduler_operation>&) ()
#2  0x00005569d106cc7d in asio::detail::scheduler::do_run_one(asio::detail::conditionally_enabled_mutex::scoped_lock&, asio::detail::scheduler_thread_info&, std::error_code const&) ()
#3  0x00005569d106d205 in asio::detail::scheduler::run(std::error_code&) ()
#4  0x00005569d1074e5e in asio::io_context::run() ()
#5  0x00005569d0dd9218 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<mongo::transport::TransportLayerASIO::start()::{lambda()#1}> > >::_M_run() ()
#6  0x00005569d17a4a7f in execute_native_thread_routine ()
#7  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 27 (Thread 0x7fef4da16700 (LWP 62165)):
#0  0x00007fef5b9c973d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fef5b9c2ce9 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x00005569d0dcd181 in mongo::transport::TransportLayerManager::makeBaton(mongo::OperationContext*) const ()
#3  0x00005569d158b011 in mongo::ServiceContext::makeOperationContext(mongo::Client*) ()
#4  0x00005569d1584787 in mongo::Client::makeOperationContext() ()
#5  0x00005569d0008120 in std::_Function_handler<void (mongo::Client*), mongo::PeriodicThreadToDecreaseSnapshotHistoryIfNotNeeded::_init(mongo::ServiceContext*)::{lambda(mongo::Client*)#1}>::_M_invoke(std::_Any_data const&, mongo::Client*&&) ()
#6  0x00005569d019bcb5 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<mongo::PeriodicRunnerImpl::PeriodicJobImpl::_run()::{lambda()#1}> > >::_M_run() ()
#7  0x00005569d17a4a7f in execute_native_thread_routine ()
#8  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#9  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 26 (Thread 0x7fef4e217700 (LWP 62164)):
#0  0x00007fef5b9c973d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fef5b9c2ce9 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x00005569d0dcd181 in mongo::transport::TransportLayerManager::makeBaton(mongo::OperationContext*) const ()
#3  0x00005569d158b011 in mongo::ServiceContext::makeOperationContext(mongo::Client*) ()
#4  0x00005569d1584787 in mongo::Client::makeOperationContext() ()
#5  0x00005569d000ae19 in std::_Function_handler<void (mongo::Client*), mongo::PeriodicThreadToAbortExpiredTransactions::_init(mongo::ServiceContext*)::{lambda(mongo::Client*)#1}>::_M_invoke(std::_Any_data const&, mongo::Client*&&) ()
#6  0x00005569d019bcb5 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<mongo::PeriodicRunnerImpl::PeriodicJobImpl::_run()::{lambda()#1}> > >::_M_run() ()
#7  0x00005569d17a4a7f in execute_native_thread_routine ()
#8  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#9  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 25 (Thread 0x7fef4ea18700 (LWP 62163)):
#0  0x00007fef5b9c63c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569d17a18dc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x00005569d0e98f6e in std::thread::_State_impl<std::thread::_Invoker<std::tuple<mongo::SessionKiller::SessionKiller(mongo::ServiceContext*, std::function<mongo::StatusWith<std::vector<mongo::HostAndPort, std::allocator<mongo::HostAndPort> > > (mongo::OperationContext*, mongo::SessionKiller::Matcher const&, std::linear_congruential_engine<unsigned long, 48271ul, 0ul, 2147483647ul>*)>)::{lambda()#1}> > >::_M_run() ()
#3  0x00005569d17a4a7f in execute_native_thread_routine ()
#4  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 24 (Thread 0x7fef4f219700 (LWP 62162)):
#0  0x00007fef5b9c677a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569d155cafa in mongo::(anonymous namespace)::PeriodicTaskRunner::run() ()
#2  0x00005569d155de1c in mongo::BackgroundJob::jobBody() ()
#3  0x00005569d17a4a7f in execute_native_thread_routine ()
#4  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 23 (Thread 0x7fef5021b700 (LWP 62160)):
#0  0x00007fef5b9c973d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fef5b9c2ce9 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x00005569d0dcd181 in mongo::transport::TransportLayerManager::makeBaton(mongo::OperationContext*) const ()
#3  0x00005569d158b011 in mongo::ServiceContext::makeOperationContext(mongo::Client*) ()
#4  0x00005569d1584787 in mongo::Client::makeOperationContext() ()
#5  0x00005569cffdd498 in mongo::TTLMonitor::doTTLPass() ()
#6  0x00005569cffde1d8 in mongo::TTLMonitor::run() ()
#7  0x00005569d155de1c in mongo::BackgroundJob::jobBody() ()
#8  0x00005569d17a4a7f in execute_native_thread_routine ()
#9  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 22 (Thread 0x7fef50a1c700 (LWP 62159)):
#0  0x00007fef5b9c973d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fef5b9c2ce9 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x00005569d0dcd181 in mongo::transport::TransportLayerManager::makeBaton(mongo::OperationContext*) const ()
#3  0x00005569d158b011 in mongo::ServiceContext::makeOperationContext(mongo::Client*) ()
#4  0x00005569d1584787 in mongo::Client::makeOperationContext() ()
#5  0x00005569d00b0a0b in mongo::FTDCCollectorCollection::collect(mongo::Client*) ()
#6  0x00005569d0094598 in mongo::FreeMonProcessor::doMetricsCollect(mongo::Client*) ()
#7  0x00005569d009aedb in mongo::FreeMonProcessor::run() ()
#8  0x00005569d17a4a7f in execute_native_thread_routine ()
#9  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 21 (Thread 0x7fef5121d700 (LWP 62157)):
#0  0x00007fef5b9c63c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569d17a18dc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x00005569d04f0fad in mongo::ThreadPool::_consumeTasks() ()
#3  0x00005569d04f25e5 in mongo::ThreadPool::_workerThreadBody(mongo::ThreadPool*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#4  0x00005569d17a4a7f in execute_native_thread_routine ()
#5  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#6  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 20 (Thread 0x7fef51a1e700 (LWP 62156)):
#0  0x00007fef5b9c63c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569d106cd9b in asio::detail::scheduler::do_run_one(asio::detail::conditionally_enabled_mutex::scoped_lock&, asio::detail::scheduler_thread_info&, std::error_code const&) ()
#2  0x00005569d106d205 in asio::detail::scheduler::run(std::error_code&) ()
#3  0x00005569d1074e5e in asio::io_context::run() ()
#4  0x00005569d0de2b7d in mongo::transport::TransportLayerASIO::ASIOReactor::run() ()
#5  0x00005569d0dc0484 in mongo::executor::NetworkInterfaceTL::_run() ()
#6  0x00005569d17a4a7f in execute_native_thread_routine ()
#7  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 19 (Thread 0x7fef5221f700 (LWP 62155)):
#0  0x00007fef5b9c973d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fef5b9c2ce9 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x00005569d0dcd181 in mongo::transport::TransportLayerManager::makeBaton(mongo::OperationContext*) const ()
#3  0x00005569d158b011 in mongo::ServiceContext::makeOperationContext(mongo::Client*) ()
#4  0x00005569d1584787 in mongo::Client::makeOperationContext() ()
#5  0x00005569d00b0a0b in mongo::FTDCCollectorCollection::collect(mongo::Client*) ()
#6  0x00005569d00b4deb in mongo::FTDCController::doLoop() ()
#7  0x00005569d17a4a7f in execute_native_thread_routine ()
#8  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#9  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 18 (Thread 0x7fef53221700 (LWP 62143)):
#0  0x00007fef5b9c63c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569d106cd9b in asio::detail::scheduler::do_run_one(asio::detail::conditionally_enabled_mutex::scoped_lock&, asio::detail::scheduler_thread_info&, std::error_code const&) ()
#2  0x00005569d106d205 in asio::detail::scheduler::run(std::error_code&) ()
#3  0x00005569d1074e5e in asio::io_context::run() ()
#4  0x00005569d0de2b7d in mongo::transport::TransportLayerASIO::ASIOReactor::run() ()
#5  0x00005569d0dc0484 in mongo::executor::NetworkInterfaceTL::_run() ()
#6  0x00005569d17a4a7f in execute_native_thread_routine ()
#7  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 17 (Thread 0x7fef53a22700 (LWP 61903)):
#0  0x00007fef5b9c677a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569d05fda9c in mongo::DeadlineMonitor<mongo::mozjs::MozJSImplScope>::deadlineMonitorThread() ()
#2  0x00005569d17a4a7f in execute_native_thread_routine ()
#3  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 16 (Thread 0x7fef54223700 (LWP 61901)):
#0  0x00007fef5b9c973d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fef5b9c2ce9 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x00005569d0dcd181 in mongo::transport::TransportLayerManager::makeBaton(mongo::OperationContext*) const ()
#3  0x00005569d158b011 in mongo::ServiceContext::makeOperationContext(mongo::Client*) ()
#4  0x00005569d1584787 in mongo::Client::makeOperationContext() ()
#5  0x00005569d042e22a in std::_Function_handler<void (mongo::Client*), mongo::StorageEngineImpl::TimestampMonitor::startup()::{lambda(mongo::Client*)#1}>::_M_invoke(std::_Any_data const&, mongo::Client*&&) ()
#6  0x00005569d019bcb5 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<mongo::PeriodicRunnerImpl::PeriodicJobImpl::_run()::{lambda()#1}> > >::_M_run() ()
#7  0x00005569d17a4a7f in execute_native_thread_routine ()
#8  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#9  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 15 (Thread 0x7fef54a24700 (LWP 61880)):
#0  0x00007fef5b9c677a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569cfc7dd35 in mongo::WiredTigerKVEngine::WiredTigerCheckpointThread::run() ()
#2  0x00005569d155de1c in mongo::BackgroundJob::jobBody() ()
#3  0x00005569d17a4a7f in execute_native_thread_routine ()
#4  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 14 (Thread 0x7fef55225700 (LWP 61879)):
#0  0x00007fef5b9ca0a5 in nanosleep () from /lib64/libpthread.so.0
#1  0x00005569d1685a0e in mongo::sleepmillis(long long) ()
#2  0x00005569cfc7d7ec in mongo::WiredTigerKVEngine::WiredTigerJournalFlusher::run() ()
#3  0x00005569d155de1c in mongo::BackgroundJob::jobBody() ()
#4  0x00005569d17a4a7f in execute_native_thread_routine ()
#5  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#6  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 13 (Thread 0x7fef55a26700 (LWP 61878)):
#0  0x00007fef5b9c677a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569cfc7d378 in mongo::WiredTigerKVEngine::WiredTigerSessionSweeper::run() ()
#2  0x00005569d155de1c in mongo::BackgroundJob::jobBody() ()
#3  0x00005569d17a4a7f in execute_native_thread_routine ()
#4  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 12 (Thread 0x7fef56227700 (LWP 61877)):
#0  0x00007fef5b9c672c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569cfcde55b in __wt_cond_wait_signal ()
#2  0x00005569cfcba36c in __sweep_server ()
#3  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 11 (Thread 0x7fef56a28700 (LWP 61876)):
#0  0x00007fef5b9c672c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569cfcde55b in __wt_cond_wait_signal ()
#2  0x00005569cfd08a46 in __wt_cond_auto_wait_signal ()
#3  0x00005569cfd08a93 in __wt_cond_auto_wait ()
#4  0x00005569cfcc9f7c in __wt_evict_thread_run ()
#5  0x00005569cfd167e9 in __thread_run ()
#6  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 10 (Thread 0x7fef57229700 (LWP 61875)):
#0  0x00007fef5b9c672c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569cfcde55b in __wt_cond_wait_signal ()
#2  0x00005569cfd08a46 in __wt_cond_auto_wait_signal ()
#3  0x00005569cfd08a93 in __wt_cond_auto_wait ()
#4  0x00005569cfcc9f7c in __wt_evict_thread_run ()
#5  0x00005569cfd167e9 in __thread_run ()
#6  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 9 (Thread 0x7fef57a2a700 (LWP 61874)):
#0  0x00007fef5b9c672c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569cfcde55b in __wt_cond_wait_signal ()
#2  0x00005569cfd08a46 in __wt_cond_auto_wait_signal ()
#3  0x00005569cfd08a93 in __wt_cond_auto_wait ()
#4  0x00005569cfcc9f7c in __wt_evict_thread_run ()
#5  0x00005569cfd167e9 in __thread_run ()
#6  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 8 (Thread 0x7fef5822b700 (LWP 61873)):
#0  0x00007fef5b9c672c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569cfcde55b in __wt_cond_wait_signal ()
#2  0x00005569cfd08a46 in __wt_cond_auto_wait_signal ()
#3  0x00005569cfd08a93 in __wt_cond_auto_wait ()
#4  0x00005569cfcc9f7c in __wt_evict_thread_run ()
#5  0x00005569cfd167e9 in __thread_run ()
#6  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 7 (Thread 0x7fef58a2c700 (LWP 61856)):
#0  0x00007fef5b9c672c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569cfcde55b in __wt_cond_wait_signal ()
#2  0x00005569cfd08a46 in __wt_cond_auto_wait_signal ()
#3  0x00005569cfd8d94a in __log_server ()
#4  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 6 (Thread 0x7fef5922d700 (LWP 61855)):
#0  0x00007fef5b9c672c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569cfcde55b in __wt_cond_wait_signal ()
#2  0x00005569cfd08a46 in __wt_cond_auto_wait_signal ()
#3  0x00005569cfd08a93 in __wt_cond_auto_wait ()
#4  0x00005569cfd8ea93 in __log_wrlsn_server ()
#5  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#6  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 5 (Thread 0x7fef59a2e700 (LWP 61854)):
#0  0x00007fef5b9c672c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569cfcde55b in __wt_cond_wait_signal ()
#2  0x00005569cfd8dc2b in __log_file_server ()
#3  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 4 (Thread 0x7fef5a22f700 (LWP 61464)):
#0  0x00007fef5b9c677a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569d158e817 in mongo::ClockSource::waitForConditionUntil(mongo::stdx::condition_variable&, std::unique_lock<std::mutex>&, mongo::Date_t, mongo::Waitable*) ()
#2  0x00005569d019bf28 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<mongo::PeriodicRunnerImpl::PeriodicJobImpl::_run()::{lambda()#1}> > >::_M_run() ()
#3  0x00005569d17a4a7f in execute_native_thread_routine ()
#4  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 3 (Thread 0x7fef5aa30700 (LWP 61463)):
#0  0x00007fef5b9c63c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569d17a18dc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x00005569d158dd97 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<mongo::BackgroundThreadClockSource::_startTimerThread()::{lambda()#1}> > >::_M_run() ()
#3  0x00005569d17a4a7f in execute_native_thread_routine ()
#4  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 2 (Thread 0x7fef5b231700 (LWP 61462)):
#0  0x00007fef5b9c63c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569d17a18dc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x00005569cfba8a4c in mongo::shutdown(mongo::ExitCode, mongo::ShutdownTaskArgs const&) ()
#3  0x00005569d031c903 in mongo::(anonymous namespace)::signalProcessingThread(mongo::LogFileStatus) ()
#4  0x00005569d17a4a7f in execute_native_thread_routine ()
#5  0x00007fef5b9c04c0 in start_thread () from /lib64/libpthread.so.0
#6  0x00007fef5b8ee553 in clone () from /lib64/libc.so.6
 
Thread 1 (Thread 0x7fef5b232c00 (LWP 61455)):
#0  0x00007fef5b9c63c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569d17a18dc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x00005569d167a76f in mongo::waitForShutdown() ()
#3  0x00005569cfc2b786 in mongo::(anonymous namespace)::_initAndListen(int) [clone .isra.508] ()
#4  0x00005569cfc2cf5d in mongo::(anonymous namespace)::mongoDbMain(int, char**, char**) ()
#5  0x00005569cfbb25b9 in main ()
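
The trace above is long, so grouping threads by what they are blocked on may make it easier to read. In it, Thread 30 sits in std::thread::join under TransportLayerASIO::shutdown, Thread 28 sits in epoll_wait under the TransportLayerASIO::start lambda, and Threads 27, 26, 23, 22, 19 and 16 sit in pthread_mutex_lock under TransportLayerManager::makeBaton. The Python sketch below is not part of the original report. It is a minimal triage helper that assumes the text format of gdb's "thread apply all backtrace" as shown here. The labels and substrings in PATTERNS are hypothetical choices made for this particular trace, and the sketch makes no claim about the root cause.

```python
import re
from collections import defaultdict

# Header line that gdb prints before each thread's frames.
THREAD_RE = re.compile(r"^Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP \d+\)\):")
# A frame line starts with "#<n>".
FRAME_RE = re.compile(r"^#\d+\s+(.*)$")

# Hypothetical labels chosen for this sketch; each maps to substrings that
# must all appear somewhere in the thread's stack.
PATTERNS = [
    ("joining a thread during transport shutdown",
     ("std::thread::join", "TransportLayerASIO::shutdown")),
    ("waiting for a mutex in makeBaton",
     ("pthread_mutex_lock", "TransportLayerManager::makeBaton")),
    ("listener thread in epoll_wait",
     ("epoll_wait", "TransportLayerASIO::start")),
]


def parse_threads(text):
    """Map each thread number to the list of its frame lines."""
    threads = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        header = THREAD_RE.match(stripped)
        if header:
            current = int(header.group(1))
            threads[current] = []
            continue
        frame = FRAME_RE.match(stripped)
        if frame and current is not None:
            threads[current].append(frame.group(1))
    return threads


def summarize(text):
    """Group thread numbers by the first pattern whose substrings all match."""
    groups = defaultdict(list)
    for thread_id, frames in parse_threads(text).items():
        stack = "\n".join(frames)
        for label, needles in PATTERNS:
            if all(needle in stack for needle in needles):
                groups[label].append(thread_id)
                break
        else:
            # No pattern matched this thread.
            groups["other"].append(thread_id)
    return {label: sorted(ids) for label, ids in groups.items()}


# Abbreviated excerpt of the trace above (some frames omitted).
SAMPLE = """\
Thread 30 (Thread 0x7fef4b911700 (LWP 63259)):
#0  0x00007fef5b9c19f8 in __pthread_timedjoin_ex () from /lib64/libpthread.so.0
#1  0x00005569d17a4ae3 in std::thread::join() ()
#2  0x00005569d0dd8331 in mongo::transport::TransportLayerASIO::shutdown() ()

Thread 28 (Thread 0x7fef4c213700 (LWP 62168)):
#0  0x00007fef5b8ee88e in epoll_wait () from /lib64/libc.so.6
#4  0x00005569d1074e5e in asio::io_context::run() ()
#5  0x00005569d0dd9218 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<mongo::transport::TransportLayerASIO::start()::{lambda()#1}> > >::_M_run() ()

Thread 27 (Thread 0x7fef4da16700 (LWP 62165)):
#0  0x00007fef5b9c973d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fef5b9c2ce9 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x00005569d0dcd181 in mongo::transport::TransportLayerManager::makeBaton(mongo::OperationContext*) const ()

Thread 24 (Thread 0x7fef4f219700 (LWP 62162)):
#0  0x00007fef5b9c677a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005569d155cafa in mongo::(anonymous namespace)::PeriodicTaskRunner::run() ()
"""

if __name__ == "__main__":
    for label, ids in summarize(SAMPLE).items():
        print(f"{label}: threads {ids}")
```

To use it on a real capture, save the gdb output to a file and pass the file contents to summarize() in place of SAMPLE. Threads that match none of the patterns are collected under "other".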



 Comments   
Comment by Markus S [ 13/Dec/19 ]

Thank you for the quick fix!

The problem never happened again with the binaries you provided me with, so I assume the problem is solved.

As for the backport, it is not a big problem for me if it does not happen. We are planning to upgrade our existing older databases to the latest version.

Comment by Benjamin Caimano (Inactive) [ 12/Dec/19 ]

On another note, thank you for the updates on v4.0 and v3.6 failures. This lines up more closely with what I would expect. It's always a pleasure to have such a responsible reporter.

Comment by Benjamin Caimano (Inactive) [ 12/Dec/19 ]

markus_schoder@gmx.at, I've released our fix and backported it to v4.2. I expect it will go out with r4.2.3. v3.6 and v4.0 still had the same general form of our transport layer but v4.0 is much less like master than v4.2. I've requested backports for v4.0 and v3.6, but I can't swear to you that we'll actually do them.

Please feel free to reopen this if you feel we haven't addressed your issues or you have additional concerns.

Comment by Githook User [ 12/Dec/19 ]

Author:

{'name': 'Ben Caimano', 'email': 'ben.caimano@mongodb.com', 'username': 'bcaimano'}

Message: SERVER-44595 Clarify TransportLayer shutdown
Branch: v4.2
https://github.com/mongodb/mongo/commit/7cf450d56329ad36b9a9fc1ac345b308df8548b8

Comment by Githook User [ 12/Dec/19 ]

Author:

{'name': 'Ben Caimano', 'email': 'ben.caimano@mongodb.com', 'username': 'bcaimano'}

Message: SERVER-44595 Clarify TransportLayer shutdown
Branch: master
https://github.com/mongodb/mongo/commit/206b9653bf997922f078b9af4b8f670ba594823a

Comment by Markus S [ 10/Dec/19 ]

Today the issue appeared for the first time on mongod v3.6.14. Just to let you know, this does not seem to be limited to v4.

Thread 23 (Thread 0x7f2bf4c8e700 (LWP 65420)):
#0  0x00007f2c00694a08 in __pthread_timedjoin_ex () from /lib64/libpthread.so.0
#1  0x0000555adaeb8d67 in std::thread::join() ()
#2  0x0000555ada8c27a3 in mongo::transport::TransportLayerASIO::shutdown() ()
#3  0x0000555ad953c689 in mongo::transport::TransportLayerManager::shutdown() ()
#4  0x0000555ad95368c2 in mongo::(anonymous namespace)::shutdownTask() ()
#5  0x0000555adada50a2 in mongo::(anonymous namespace)::runTasks(std::stack<std::function<void ()>, std::deque<std::function<void ()>, std::allocator<std::function<void ()> > > >) [clone .constprop.39] ()
#6  0x0000555ad94c2959 in mongo::shutdown(mongo::ExitCode) ()
#7  0x0000555ada4c60bf in mongo::CmdShutdown::shutdownHelper() ()
#8  0x0000555ad97a4945 in mongo::CmdShutdownMongoD::run(mongo::OperationContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, mongo::BSONObj const&, mongo::BSONObjBuilder&) ()
#9  0x0000555ada6c0496 in mongo::BasicCommand::enhancedRun(mongo::OperationContext*, mongo::OpMsgRequest const&, mongo::BSONObjBuilder&) ()
#10 0x0000555ada6bb8ff in mongo::Command::publicRun(mongo::OperationContext*, mongo::OpMsgRequest const&, mongo::BSONObjBuilder&) ()
#11 0x0000555ad976225b in mongo::(anonymous namespace)::execCommandDatabase(mongo::OperationContext*, mongo::Command*, mongo::OpMsgRequest const&, mongo::rpc::ReplyBuilderInterface*) [clone .constprop.284] ()
#12 0x0000555ad976407c in mongo::(anonymous namespace)::runCommands(mongo::OperationContext*, mongo::Message const&)::{lambda()#1}::operator()() const ()
#13 0x0000555ad9764ed4 in mongo::ServiceEntryPointMongod::handleRequest(mongo::OperationContext*, mongo::Message const&) ()
#14 0x0000555ad9772f5a in mongo::ServiceStateMachine::_processMessage(mongo::ServiceStateMachine::ThreadGuard) ()
#15 0x0000555ad976e8b7 in mongo::ServiceStateMachine::_runNextInGuard(mongo::ServiceStateMachine::ThreadGuard) ()
#16 0x0000555ad9771d41 in std::_Function_handler<void (), mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName, mongo::ServiceStateMachine::Ownership)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#17 0x0000555ada6944d2 in mongo::transport::ServiceExecutorSynchronous::schedule(std::function<void ()>, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName) ()
#18 0x0000555ad976d6f0 in mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName, mongo::ServiceStateMachine::Ownership) ()
#19 0x0000555ad976fc85 in mongo::ServiceStateMachine::_sourceCallback(mongo::Status) ()
#20 0x0000555ad9770581 in mongo::ServiceStateMachine::_sourceMessage(mongo::ServiceStateMachine::ThreadGuard) ()
#21 0x0000555ad976e93d in mongo::ServiceStateMachine::_runNextInGuard(mongo::ServiceStateMachine::ThreadGuard) ()
#22 0x0000555ad9771d41 in std::_Function_handler<void (), mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName, mongo::ServiceStateMachine::Ownership)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#23 0x0000555ada694a35 in std::_Function_handler<void (), mongo::transport::ServiceExecutorSynchronous::schedule(std::function<void ()>, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#24 0x0000555adac63334 in mongo::(anonymous namespace)::runFunc(void*) ()
#25 0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#26 0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 22 (Thread 0x7f2bf548f700 (LWP 65224)):
#0  0x00007f2c005c149e in epoll_wait () from /lib64/libc.so.6
#1  0x0000555ada8cf83e in asio::detail::epoll_reactor::run(long, asio::detail::op_queue<asio::detail::scheduler_operation>&) ()
#2  0x0000555ada8d0eee in asio::detail::scheduler::do_run_one(asio::detail::conditionally_enabled_mutex::scoped_lock&, asio::detail::scheduler_thread_info&, std::error_code const&) ()
#3  0x0000555ada8d1441 in asio::detail::scheduler::run(std::error_code&) ()
#4  0x0000555ada8db56e in asio::io_context::run() ()
#5  0x0000555ada8c307e in std::thread::_Impl<std::_Bind_simple<mongo::transport::TransportLayerASIO::start()::{lambda()#1} ()> >::_M_run() ()
#6  0x0000555adaeb8e20 in execute_native_thread_routine ()
#7  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 21 (Thread 0x7f2bf5c90700 (LWP 65223)):
#0  0x00007f2c0069978a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555adac62fd0 in mongo::ClockSource::waitForConditionUntil(std::condition_variable&, std::unique_lock<std::mutex>&, mongo::Date_t) ()
#2  0x0000555ad9bc9109 in std::thread::_Impl<std::_Bind_simple<mongo::PeriodicRunnerImpl::PeriodicJobImpl::run()::{lambda()#1} ()> >::_M_run() ()
#3  0x0000555adaeb8e20 in execute_native_thread_routine ()
#4  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 20 (Thread 0x7f2bf6491700 (LWP 65222)):
#0  0x00007f2c0069978a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555adac62fd0 in mongo::ClockSource::waitForConditionUntil(std::condition_variable&, std::unique_lock<std::mutex>&, mongo::Date_t) ()
#2  0x0000555ad9bc9109 in std::thread::_Impl<std::_Bind_simple<mongo::PeriodicRunnerImpl::PeriodicJobImpl::run()::{lambda()#1} ()> >::_M_run() ()
#3  0x0000555adaeb8e20 in execute_native_thread_routine ()
#4  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 19 (Thread 0x7f2bf6c92700 (LWP 65221)):
#0  0x00007f2c006993d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555adaeb5e5c in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x0000555ada6a2cf5 in std::thread::_Impl<std::_Bind_simple<mongo::SessionKiller::SessionKiller(mongo::ServiceContext*, std::function<mongo::StatusWith<std::vector<mongo::HostAndPort, std::allocator<mongo::HostAndPort> > > (mongo::OperationContext*, mongo::SessionKiller::Matcher const&, std::linear_congruential_engine<unsigned long, 48271ul, 0ul, 2147483647ul>*)>)::{lambda()#1} ()> >::_M_run() ()
#3  0x0000555adaeb8e20 in execute_native_thread_routine ()
#4  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 18 (Thread 0x7f2bf7493700 (LWP 65220)):
#0  0x00007f2c0069978a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555adac9fbc4 in mongo::(anonymous namespace)::PeriodicTaskRunner::run() ()
#2  0x0000555adac9e8c1 in mongo::BackgroundJob::jobBody() ()
#3  0x0000555adaeb8e20 in execute_native_thread_routine ()
#4  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 17 (Thread 0x7f2bf8c96700 (LWP 65217)):
#0  0x00007f2c0069978a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ada4a8f69 in mongo::FTDCController::doLoop() ()
#2  0x0000555adaeb8e20 in execute_native_thread_routine ()
#3  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 16 (Thread 0x7f2bf9497700 (LWP 64984)):
#0  0x00007f2c006993d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ada8d10ab in asio::detail::scheduler::do_run_one(asio::detail::conditionally_enabled_mutex::scoped_lock&, asio::detail::scheduler_thread_info&, std::error_code const&) ()
#2  0x0000555ada8d1441 in asio::detail::scheduler::run(std::error_code&) ()
#3  0x0000555ada814b39 in mongo::executor::NetworkInterfaceASIO::startup()::{lambda()#1}::operator()() const [clone .constprop.370] ()
#4  0x0000555adaeb8e20 in execute_native_thread_routine ()
#5  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#6  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 15 (Thread 0x7f2bf9c98700 (LWP 64982)):
#0  0x00007f2c0069978a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ad9da2767 in mongo::DeadlineMonitor<mongo::mozjs::MozJSImplScope>::deadlineMonitorThread() ()
#2  0x0000555adaeb8e20 in execute_native_thread_routine ()
#3  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 14 (Thread 0x7f2bfa499700 (LWP 64907)):
#0  0x00007f2c0069978a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ad9577c4c in mongo::WiredTigerKVEngine::WiredTigerCheckpointThread::run() ()
#2  0x0000555adac9e8c1 in mongo::BackgroundJob::jobBody() ()
#3  0x0000555adaeb8e20 in execute_native_thread_routine ()
#4  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 13 (Thread 0x7f2bfac9a700 (LWP 64906)):
#0  0x00007f2c0069d0b5 in nanosleep () from /lib64/libpthread.so.0
#1  0x0000555adadb1d83 in mongo::sleepmillis(long long) ()
#2  0x0000555ad9577653 in mongo::WiredTigerKVEngine::WiredTigerJournalFlusher::run() ()
#3  0x0000555adac9e8c1 in mongo::BackgroundJob::jobBody() ()
#4  0x0000555adaeb8e20 in execute_native_thread_routine ()
#5  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#6  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 12 (Thread 0x7f2bfb49b700 (LWP 64905)):
#0  0x00007f2c0069978a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ad95770ab in mongo::WiredTigerKVEngine::WiredTigerSessionSweeper::run() ()
#2  0x0000555adac9e8c1 in mongo::BackgroundJob::jobBody() ()
#3  0x0000555adaeb8e20 in execute_native_thread_routine ()
#4  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 11 (Thread 0x7f2bfbc9c700 (LWP 64904)):
#0  0x00007f2c0069973c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ad95d0759 in __wt_cond_wait_signal ()
#2  0x0000555ad95ab86e in __sweep_server ()
#3  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 10 (Thread 0x7f2bfc49d700 (LWP 64903)):
#0  0x00007f2c0069973c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ad95d0759 in __wt_cond_wait_signal ()
#2  0x0000555ad960458c in __wt_cond_auto_wait_signal ()
#3  0x0000555ad96045d3 in __wt_cond_auto_wait ()
#4  0x0000555ad95bb16e in __wt_evict_thread_run ()
#5  0x0000555ad9611b29 in __thread_run ()
#6  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 9 (Thread 0x7f2bfcc9e700 (LWP 64902)):
#0  0x00007f2c0069973c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ad95d0759 in __wt_cond_wait_signal ()
#2  0x0000555ad960458c in __wt_cond_auto_wait_signal ()
#3  0x0000555ad96045d3 in __wt_cond_auto_wait ()
#4  0x0000555ad95bb16e in __wt_evict_thread_run ()
#5  0x0000555ad9611b29 in __thread_run ()
#6  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 8 (Thread 0x7f2bfd49f700 (LWP 64901)):
#0  0x00007f2c0069973c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ad95d0759 in __wt_cond_wait_signal ()
#2  0x0000555ad960458c in __wt_cond_auto_wait_signal ()
#3  0x0000555ad96045d3 in __wt_cond_auto_wait ()
#4  0x0000555ad95bb16e in __wt_evict_thread_run ()
#5  0x0000555ad9611b29 in __thread_run ()
#6  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 7 (Thread 0x7f2bfdca0700 (LWP 64900)):
#0  0x00007f2c0069973c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ad95d0759 in __wt_cond_wait_signal ()
#2  0x0000555ad960458c in __wt_cond_auto_wait_signal ()
#3  0x0000555ad96045d3 in __wt_cond_auto_wait ()
#4  0x0000555ad95bb16e in __wt_evict_thread_run ()
#5  0x0000555ad9611b29 in __thread_run ()
#6  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 6 (Thread 0x7f2bfe4a1700 (LWP 64889)):
#0  0x00007f2c0069973c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ad95d0759 in __wt_cond_wait_signal ()
#2  0x0000555ad960458c in __wt_cond_auto_wait_signal ()
#3  0x0000555ad9664345 in __log_server ()
#4  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 5 (Thread 0x7f2bfeca2700 (LWP 64888)):
#0  0x00007f2c0069973c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ad95d0759 in __wt_cond_wait_signal ()
#2  0x0000555ad960458c in __wt_cond_auto_wait_signal ()
#3  0x0000555ad96045d3 in __wt_cond_auto_wait ()
#4  0x0000555ad9665993 in __log_wrlsn_server ()
#5  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#6  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 4 (Thread 0x7f2bff4a3700 (LWP 64887)):
#0  0x00007f2c0069973c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555ad95d0759 in __wt_cond_wait_signal ()
#2  0x0000555ad96649db in __log_file_server ()
#3  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 3 (Thread 0x7f2bffca4700 (LWP 64529)):
#0  0x00007f2c006993d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555adaeb5e5c in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x0000555adac62a6d in std::thread::_Impl<std::_Bind_simple<mongo::BackgroundThreadClockSource::_startTimerThread()::{lambda()#1} ()> >::_M_run() ()
#3  0x0000555adaeb8e20 in execute_native_thread_routine ()
#4  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 2 (Thread 0x7f2c004a5700 (LWP 64528)):
#0  0x00007f2c004fdc22 in sigtimedwait () from /lib64/libc.so.6
#1  0x00007f2c0069d7ac in sigwait () from /lib64/libpthread.so.0
#2  0x0000555ada4e550d in mongo::(anonymous namespace)::signalProcessingThread(mongo::LogFileStatus) ()
#3  0x0000555adaeb8e20 in execute_native_thread_routine ()
#4  0x00007f2c006934c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f2c005c1163 in clone () from /lib64/libc.so.6
 
Thread 1 (Thread 0x7f2c004a6a80 (LWP 64522)):
#0  0x00007f2c006993d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000555adaeb5e5c in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x0000555adada51df in mongo::waitForShutdown() ()
#3  0x0000555ad945c025 in mongo::(anonymous namespace)::_initAndListen(int) [clone .isra.326] ()
#4  0x0000555ad953794c in mongo::mongoDbMain(int, char**, char**) ()
#5  0x0000555ad94c38a9 in main ()

Comment by Markus S [ 03/Dec/19 ]

The problem has now occurred on v4.0.13 as well; I mention it just to let you know. If it happens on an older version too, I will report it again.
If it helps, I can also provide the core dump of the process in this state; just tell me.

Thread 25 (Thread 0x7f188652f700 (LWP 60014)):
#0  0x00007f18948329f8 in __pthread_timedjoin_ex () from /lib64/libpthread.so.0
#1  0x00005654c422b5e7 in std::thread::join() ()
#2  0x00005654c396afb1 in mongo::transport::TransportLayerASIO::shutdown() ()
#3  0x00005654c395dc29 in mongo::transport::TransportLayerManager::shutdown() ()
#4  0x00005654c2758556 in mongo::(anonymous namespace)::shutdownTask(mongo::ShutdownTaskArgs const&) ()
#5  0x00005654c4116be5 in mongo::(anonymous namespace)::runTasks(std::stack<std::function<void (mongo::ShutdownTaskArgs const&)>, std::deque<std::function<void (mongo::ShutdownTaskArgs const&)>, std::allocator<std::function<void (mongo::ShutdownTaskArgs const&)> > > >, mongo::ShutdownTaskArgs const&) [clone .constprop.40] ()
#6  0x00005654c26eca5f in mongo::shutdown(mongo::ExitCode, mongo::ShutdownTaskArgs const&) ()
#7  0x00005654c387dc05 in mongo::CmdShutdown::shutdownHelper(mongo::BSONObj const&) ()
#8  0x00005654c2d1a0f0 in mongo::(anonymous namespace)::CmdShutdownMongoD::run(mongo::OperationContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, mongo::BSONObj const&, mongo::BSONObjBuilder&) ()
#9  0x00005654c3b80179 in mongo::BasicCommand::Invocation::run(mongo::OperationContext*, mongo::CommandReplyBuilder*) ()
#10 0x00005654c27a0fde in mongo::(anonymous namespace)::execCommandDatabase(mongo::OperationContext*, mongo::Command*, mongo::OpMsgRequest const&, mongo::rpc::ReplyBuilderInterface*, mongo::ServiceEntryPointCommon::Hooks const&) [clone .constprop.314] ()
#11 0x00005654c27a2ed9 in mongo::(anonymous namespace)::receivedCommands(mongo::OperationContext*, mongo::Message const&, mongo::ServiceEntryPointCommon::Hooks const&)::{lambda()#1}::operator()() const ()
#12 0x00005654c27a3e21 in mongo::ServiceEntryPointCommon::handleRequest(mongo::OperationContext*, mongo::Message const&, mongo::ServiceEntryPointCommon::Hooks const&) ()
#13 0x00005654c278f4ba in mongo::ServiceEntryPointMongod::handleRequest(mongo::OperationContext*, mongo::Message const&) ()
#14 0x00005654c279c1ea in mongo::ServiceStateMachine::_processMessage(mongo::ServiceStateMachine::ThreadGuard) ()
#15 0x00005654c2796e97 in mongo::ServiceStateMachine::_runNextInGuard(mongo::ServiceStateMachine::ThreadGuard) ()
#16 0x00005654c279a6b1 in std::_Function_handler<void (), mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName, mongo::ServiceStateMachine::Ownership)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#17 0x00005654c3969942 in mongo::transport::ServiceExecutorSynchronous::schedule(std::function<void ()>, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName) ()
#18 0x00005654c2795080 in mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName, mongo::ServiceStateMachine::Ownership) ()
#19 0x00005654c27981c5 in mongo::ServiceStateMachine::_sourceCallback(mongo::Status) ()
#20 0x00005654c27965d7 in mongo::ServiceStateMachine::_sourceMessage(mongo::ServiceStateMachine::ThreadGuard) ()
#21 0x00005654c2796f1d in mongo::ServiceStateMachine::_runNextInGuard(mongo::ServiceStateMachine::ThreadGuard) ()
#22 0x00005654c279a6b1 in std::_Function_handler<void (), mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName, mongo::ServiceStateMachine::Ownership)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#23 0x00005654c3969ea5 in std::_Function_handler<void (), mongo::transport::ServiceExecutorSynchronous::schedule(std::function<void ()>, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::transport::ServiceExecutorTaskName)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#24 0x00005654c4073024 in mongo::(anonymous namespace)::runFunc(void*) ()
#25 0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#26 0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 24 (Thread 0x7f1886d30700 (LWP 59986)):
#0  0x00007f189475f88e in epoll_wait () from /lib64/libc.so.6
#1  0x00005654c3b4fffe in asio::detail::epoll_reactor::run(long, asio::detail::op_queue<asio::detail::scheduler_operation>&) ()
#2  0x00005654c3b516ae in asio::detail::scheduler::do_run_one(asio::detail::conditionally_enabled_mutex::scoped_lock&, asio::detail::scheduler_thread_info&, std::error_code const&) ()
#3  0x00005654c3b51c01 in asio::detail::scheduler::run(std::error_code&) ()
#4  0x00005654c3b5bc5e in asio::io_context::run() ()
#5  0x00005654c396b9be in std::thread::_Impl<std::_Bind_simple<mongo::transport::TransportLayerASIO::start()::{lambda()#1} ()> >::_M_run() ()
#6  0x00005654c422b6a0 in execute_native_thread_routine ()
#7  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 23 (Thread 0x7f1888533700 (LWP 59983)):
#0  0x00007f189483777a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c4072ca8 in mongo::ClockSource::waitForConditionUntil(std::condition_variable&, std::unique_lock<std::mutex>&, mongo::Date_t) ()
#2  0x00005654c2b7a2fa in std::thread::_Impl<std::_Bind_simple<mongo::PeriodicRunnerImpl::PeriodicJobImpl::_run()::{lambda()#1} ()> >::_M_run() ()
#3  0x00005654c422b6a0 in execute_native_thread_routine ()
#4  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 22 (Thread 0x7f1888d34700 (LWP 59982)):
#0  0x00007f18948373c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c42286dc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x00005654c3abe1e5 in std::thread::_Impl<std::_Bind_simple<mongo::SessionKiller::SessionKiller(mongo::ServiceContext*, std::function<mongo::StatusWith<std::vector<mongo::HostAndPort, std::allocator<mongo::HostAndPort> > > (mongo::OperationContext*, mongo::SessionKiller::Matcher const&, std::linear_congruential_engine<unsigned long, 48271ul, 0ul, 2147483647ul>*)>)::{lambda()#1} ()> >::_M_run() ()
#3  0x00005654c422b6a0 in execute_native_thread_routine ()
#4  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 21 (Thread 0x7f1889535700 (LWP 59981)):
#0  0x00007f189483777a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c3b6e9e4 in mongo::(anonymous namespace)::PeriodicTaskRunner::run() ()
#2  0x00005654c3b6e0e1 in mongo::BackgroundJob::jobBody() ()
#3  0x00005654c422b6a0 in execute_native_thread_routine ()
#4  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 20 (Thread 0x7f188ad38700 (LWP 59978)):
#0  0x00007f189483777a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c2ca3bfb in mongo::FreeMonMessageQueue::dequeue(mongo::ClockSource*) ()
#2  0x00005654c2c9f962 in mongo::FreeMonProcessor::run() ()
#3  0x00005654c422b6a0 in execute_native_thread_routine ()
#4  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 19 (Thread 0x7f188b539700 (LWP 59974)):
#0  0x00007f18948373c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c42286dc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x00005654c38c4c60 in mongo::ThreadPool::_consumeTasks() ()
#3  0x00005654c38c5396 in mongo::ThreadPool::_workerThreadBody(mongo::ThreadPool*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#4  0x00005654c422b6a0 in execute_native_thread_routine ()
#5  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#6  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 18 (Thread 0x7f188bd3a700 (LWP 59973)):
#0  0x00007f18948373c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c3b5186b in asio::detail::scheduler::do_run_one(asio::detail::conditionally_enabled_mutex::scoped_lock&, asio::detail::scheduler_thread_info&, std::error_code const&) ()
#2  0x00005654c3b51c01 in asio::detail::scheduler::run(std::error_code&) ()
#3  0x00005654c3b5bc5e in asio::io_context::run() ()
#4  0x00005654c397772d in mongo::transport::TransportLayerASIO::ASIOReactor::run() ()
#5  0x00005654c3951738 in mongo::executor::NetworkInterfaceTL::_run() ()
#6  0x00005654c422b6a0 in execute_native_thread_routine ()
#7  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 17 (Thread 0x7f188c53b700 (LWP 59972)):
#0  0x00007f189483777a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c2cc3959 in mongo::FTDCController::doLoop() ()
#2  0x00005654c422b6a0 in execute_native_thread_routine ()
#3  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 16 (Thread 0x7f188cd3c700 (LWP 59971)):
#0  0x00007f18948373c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c3b5186b in asio::detail::scheduler::do_run_one(asio::detail::conditionally_enabled_mutex::scoped_lock&, asio::detail::scheduler_thread_info&, std::error_code const&) ()
#2  0x00005654c3b51c01 in asio::detail::scheduler::run(std::error_code&) ()
#3  0x00005654c3b5bc5e in asio::io_context::run() ()
#4  0x00005654c397772d in mongo::transport::TransportLayerASIO::ASIOReactor::run() ()
#5  0x00005654c3951738 in mongo::executor::NetworkInterfaceTL::_run() ()
#6  0x00005654c422b6a0 in execute_native_thread_routine ()
#7  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 15 (Thread 0x7f188d53d700 (LWP 59786)):
#0  0x00007f189483777a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c306544f in mongo::DeadlineMonitor<mongo::mozjs::MozJSImplScope>::deadlineMonitorThread() ()
#2  0x00005654c422b6a0 in execute_native_thread_routine ()
#3  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 14 (Thread 0x7f188dd3e700 (LWP 59706)):
#0  0x00007f189483777a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c27cabbd in mongo::WiredTigerKVEngine::WiredTigerCheckpointThread::run() ()
#2  0x00005654c3b6e0e1 in mongo::BackgroundJob::jobBody() ()
#3  0x00005654c422b6a0 in execute_native_thread_routine ()
#4  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 13 (Thread 0x7f188e53f700 (LWP 59705)):
#0  0x00007f189483b0a5 in nanosleep () from /lib64/libpthread.so.0
#1  0x00005654c41242f3 in mongo::sleepmillis(long long) ()
#2  0x00005654c27c9de2 in mongo::WiredTigerKVEngine::WiredTigerJournalFlusher::run() ()
#3  0x00005654c3b6e0e1 in mongo::BackgroundJob::jobBody() ()
#4  0x00005654c422b6a0 in execute_native_thread_routine ()
#5  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#6  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 12 (Thread 0x7f188ed40700 (LWP 59704)):
#0  0x00007f189483777a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c27ca560 in mongo::WiredTigerKVEngine::WiredTigerSessionSweeper::run() ()
#2  0x00005654c3b6e0e1 in mongo::BackgroundJob::jobBody() ()
#3  0x00005654c422b6a0 in execute_native_thread_routine ()
#4  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 11 (Thread 0x7f188f541700 (LWP 59703)):
#0  0x00007f189483772c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c282c1f9 in __wt_cond_wait_signal ()
#2  0x00005654c2806bfe in __sweep_server ()
#3  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 10 (Thread 0x7f188fd42700 (LWP 59702)):
#0  0x00007f189483772c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c282c1f9 in __wt_cond_wait_signal ()
#2  0x00005654c285744c in __wt_cond_auto_wait_signal ()
#3  0x00005654c2857493 in __wt_cond_auto_wait ()
#4  0x00005654c281616c in __wt_evict_thread_run ()
#5  0x00005654c2864019 in __thread_run ()
#6  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 9 (Thread 0x7f1890543700 (LWP 59701)):
#0  0x00007f189483772c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c282c1f9 in __wt_cond_wait_signal ()
#2  0x00005654c285744c in __wt_cond_auto_wait_signal ()
#3  0x00005654c2857493 in __wt_cond_auto_wait ()
#4  0x00005654c281616c in __wt_evict_thread_run ()
#5  0x00005654c2864019 in __thread_run ()
#6  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 8 (Thread 0x7f1890d44700 (LWP 59700)):
#0  0x00007f189483772c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c282c1f9 in __wt_cond_wait_signal ()
#2  0x00005654c285744c in __wt_cond_auto_wait_signal ()
#3  0x00005654c2857493 in __wt_cond_auto_wait ()
#4  0x00005654c281616c in __wt_evict_thread_run ()
#5  0x00005654c2864019 in __thread_run ()
#6  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 7 (Thread 0x7f1891545700 (LWP 59699)):
#0  0x00007f189483772c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c282c1f9 in __wt_cond_wait_signal ()
#2  0x00005654c285744c in __wt_cond_auto_wait_signal ()
#3  0x00005654c2857493 in __wt_cond_auto_wait ()
#4  0x00005654c281616c in __wt_evict_thread_run ()
#5  0x00005654c2864019 in __thread_run ()
#6  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 6 (Thread 0x7f1891d46700 (LWP 59697)):
#0  0x00007f189483772c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c282c1f9 in __wt_cond_wait_signal ()
#2  0x00005654c285744c in __wt_cond_auto_wait_signal ()
#3  0x00005654c28c3797 in __log_server ()
#4  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 5 (Thread 0x7f1892547700 (LWP 59696)):
#0  0x00007f189483772c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c282c1f9 in __wt_cond_wait_signal ()
#2  0x00005654c285744c in __wt_cond_auto_wait_signal ()
#3  0x00005654c2857493 in __wt_cond_auto_wait ()
#4  0x00005654c28c4df3 in __log_wrlsn_server ()
#5  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#6  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 4 (Thread 0x7f1892d48700 (LWP 59695)):
#0  0x00007f189483772c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c282c1f9 in __wt_cond_wait_signal ()
#2  0x00005654c28c3e3b in __log_file_server ()
#3  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 3 (Thread 0x7f1893549700 (LWP 59554)):
#0  0x00007f18948373c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c42286dc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x00005654c407268e in std::thread::_Impl<std::_Bind_simple<mongo::BackgroundThreadClockSource::_startTimerThread()::{lambda()#1} ()> >::_M_run() ()
#3  0x00005654c422b6a0 in execute_native_thread_routine ()
#4  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 2 (Thread 0x7f1893d4a700 (LWP 59553)):
#0  0x00007f189469bc22 in sigtimedwait () from /lib64/libc.so.6
#1  0x00007f189483b79c in sigwait () from /lib64/libpthread.so.0
#2  0x00005654c388d49d in mongo::(anonymous namespace)::signalProcessingThread(mongo::LogFileStatus) ()
#3  0x00005654c422b6a0 in execute_native_thread_routine ()
#4  0x00007f18948314c0 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f189475f553 in clone () from /lib64/libc.so.6
 
Thread 1 (Thread 0x7f1893d4c000 (LWP 59548)):
#0  0x00007f18948373c5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00005654c42286dc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2  0x00005654c4116d1f in mongo::waitForShutdown() ()
#3  0x00005654c275b6c9 in mongo::(anonymous namespace)::_initAndListen(int) [clone .isra.473] ()
#4  0x00005654c275d6cf in mongo::mongoDbMain(int, char**, char**) ()
#5  0x00005654c26ed9b9 in main ()

 

Comment by Markus S [ 02/Dec/19 ]

Thank you so much!

I deployed the binaries on the affected system. If the problem reappears, I will reply immediately.

P.S. Also, no worries about the delay; I was on holiday last week myself.

Comment by Benjamin Caimano (Inactive) [ 22/Nov/19 ]

markus_schoder@gmx.at, after some discussion with mira.carey@mongodb.com, we think this might be an issue with atomic instruction reordering in our shutdown procedure. (We did update our toolchain for v4.2 and you may have a newer version of glibc for Fedora 30 than RHEL 8.0.)

I am personally unable to reproduce your issue but I've prepared a patched version of the r4.2.1 RHEL 8.0 mongod that you can try here. Please keep in mind that this isn't an official release and is very much a "use at your own risk" scenario. This binary did pass our CI suite but our release team did not verify it.

In any case, we'll try to get the code changes into v4.2. A more predictable networking shutdown will always be a good idea.
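For readers unfamiliar with this class of hang, here is a minimal sketch of a shutdown handshake that does not depend on the relative timing of the two threads. It is not the patch mentioned above and not MongoDB code: the `Listener` class and all of its members are hypothetical names chosen for illustration. It mirrors the shape seen in the backtraces, where one thread sits in `std::thread::join()` inside `TransportLayerASIO::shutdown()` while the listener thread is still blocked. The stop flag is written and read under a single mutex, so the request cannot be missed even if `shutdown()` runs before the listener starts waiting.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical listener; the names are illustrative, not MongoDB's.
class Listener {
public:
    // Spawns the background thread that waits for a stop request.
    void start() {
        _thread = std::thread([this] { run(); });
    }

    // Publishes the stop request under the mutex, wakes the waiter,
    // then joins. Works whether or not the thread is already waiting.
    void shutdown() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _stopRequested = true;
        }
        _cv.notify_all();
        if (_thread.joinable()) {
            _thread.join();
        }
        assert(!_thread.joinable());
    }

    // True once the background thread has observed the stop request.
    bool finished() const {
        return _finished.load();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lk(_mutex);
        // The predicate is checked before blocking, so a stop request
        // issued before this point is still seen.
        _cv.wait(lk, [this] { return _stopRequested; });
        _finished.store(true);
    }

    std::mutex _mutex;
    std::condition_variable _cv;
    bool _stopRequested = false;  // guarded by _mutex
    std::atomic<bool> _finished{false};
    std::thread _thread;
};
```

Calling `start()` followed immediately by `shutdown()` returns in either interleaving, because the waiter re-checks the flag under the same lock that the stopper used to set it.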

P.S. Apologies for the delay, it was Thanksgiving week in the US.

Comment by Markus S [ 22/Nov/19 ]

Hello Benjamin!

Before this we were using v3.4.10. We upgraded via v3.6.14 and v4.0.13. I can't say in which version the problem starts, because we only used the intermediate versions for upgrading the databases.

The problem occurs on our QA machine, which copies many databases and runs them with our software to make sure everything is OK. Most of those databases are still from v3.4, so the upgrade procedure using v3.6 and v4.0 runs on them every time to get them working with v4.2. If I notice a stuck process on a lower version, I will tell you.

All connections to mongodb are over localhost. Our software uses the mongo-cxx-driver to connect to mongodb, and we use the "MongoDB shell" to shut down mongodb once we are done.

Comment by Benjamin Caimano (Inactive) [ 21/Nov/19 ]

Hey markus_schoder@gmx.at, thanks for submitting a ticket.

After upgrading to v4.2.1 we started to get mongod processes hanging after receiving the shutdown command.

I am curious what version you were running before. The stacks you have shared show parts of our system that haven't changed substantially since v4.0.

Also, while I have you, can you tell me anything about your activity pattern? I see that it was hanging on shutdown after you established your third connection. It looks like you're using unix domain sockets as well. Are you listening on any web interfaces or just domain sockets?

edit: Apologies, I do see that you are using localhost for your connections. I think the domain socket is our default created one.

Generated at Thu Feb 08 05:06:25 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.