[SERVER-78643] mongos aborts with InterruptedAtShutdown exception Created: 03/Jul/23  Updated: 12/Dec/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Adi Zaimi Assignee: Backlog - Cluster Scalability
Resolution: Unresolved Votes: 0
Labels: Sharding-NYC, sharding-nyc-subteam2
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Cluster Scalability
Operating System: ALL
Participants:
Story Points: 2

 Description   

In this patch build, mongos aborts with the following stack trace:
(gdb) ba
#0 0x00007f6f5dd09c1f in raise () from /lib64/libpthread.so.0
#1 0x00007f6f60a27b5e in mongo::(anonymous namespace)::endProcessWithSignal (signalNum=6) at src/mongo/util/signal_handlers_synchronous.cpp:120
#2 mongo::(anonymous namespace)::myTerminate () at src/mongo/util/signal_handlers_synchronous.cpp:262
#3 0x00007f6f602107fa in __cxxabiv1::__terminate(void (*)()) () from /data/debug/lib/libfmt.so
#4 0x00007f6f60210865 in std::terminate() () from /data/debug/lib/libfmt.so
#5 0x00007f6f5aa6439b in __clang_call_terminate () from /data/debug/lib/libperiodic_runner_impl.so
#6 0x00007f6f5aa60bdc in mongo::stdx::thread::thread<mongo::PeriodicRunnerImpl::PeriodicJobImpl::_run()::$_0, , 0>(mongo::PeriodicRunnerImpl::PeriodicJobImpl::_run()::$_0)::{lambda()#1}::operator()() (this=<optimized out>) at src/mongo/stdx/thread.h:193
#7 std::__invoke_impl<void, mongo::stdx::thread::thread<mongo::PeriodicRunnerImpl::PeriodicJobImpl::_run()::$_0, , 0>(mongo::PeriodicRunnerImpl::PeriodicJobImpl::_run()::$_0)::{lambda()#1}>(std::__invoke_other, mongo::stdx::thread::thread<mongo::PeriodicRunnerImpl::PeriodicJobImpl::_run()::$_0, , 0>(mongo::PeriodicRunnerImpl::PeriodicJobImpl::_run()::$_0)::{lambda()#1}&&) (__f=...) at /opt/mongodbtoolchain/revisions/69f4f0673ffcb290ce2307560a4883ecf2ad138c/stow/gcc-v4.35T/lib/gcc/x86_64-mongodb-linux/11.3.0/../../../../include/c++/11.3.0/bits/invoke.h:61
#8 0x00007f6f5f18e644 in execute_native_thread_routine () from /data/debug/lib/libabsl_base.so
#9 0x00007f6f5dcff2de in start_thread () from /lib64/libpthread.so.0
#10 0x00007f6f5d817a63 in clone () from /lib64/libc.so.6
 

Max suggests the following: "I would say the InterruptedAtShutdown exception likely came from https://github.com/mongodb/mongo/blob/7060765f3e0d33b74b2f6c7764fb3e88ece3b805/src/mongo/idl/cluster_server_parameter_refresher.cpp#L289 , where the SharedLock acquisition in https://github.com/mongodb/mongo/blob/7060765f3e0d33b74b2f6c7764fb3e88ece3b805/src/mongo/db/query/query_settings_manager.cpp#L167 threw because the server shutdown sequence was started, and that causes all OperationContexts to report as being interrupted: https://github.com/mongodb/mongo/blob/7060765f3e0d33b74b2f6c7764fb3e88ece3b805/src/mongo/db/operation_context.cpp#L244-L247"
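
The failure mode in the stack trace (an exception escaping a thread's entry function, which ends in std::terminate) can be illustrated with a minimal, standalone sketch. This is not MongoDB code: ShutdownInterrupted, ShutdownFlag and runJobGuarded are hypothetical names standing in for the InterruptedAtShutdown exception, the OperationContext interrupt check and the periodic job thread. The sketch only shows that catching the shutdown exception inside the thread body lets the thread exit cleanly; it does not claim this is the right fix for this ticket.

```cpp
#include <atomic>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical stand-in for an InterruptedAtShutdown-style exception.
struct ShutdownInterrupted : std::runtime_error {
    ShutdownInterrupted() : std::runtime_error("interrupted at shutdown") {}
};

// Hypothetical stand-in for a shutdown-aware interrupt check.
class ShutdownFlag {
public:
    void beginShutdown() {
        _shuttingDown.store(true);
    }

    // Throws once shutdown has started, like an interrupted lock acquisition.
    void checkForInterrupt() const {
        if (_shuttingDown.load()) {
            throw ShutdownInterrupted{};
        }
    }

private:
    std::atomic<bool> _shuttingDown{false};
};

// Runs `job` on its own thread and reports how it ended.
// Without the try/catch, an exception leaving the thread's entry function
// would call std::terminate() and abort the whole process.
std::string runJobGuarded(const std::function<void()>& job) {
    std::string outcome = "completed";
    std::thread worker([&] {
        try {
            job();
        } catch (const ShutdownInterrupted& ex) {
            outcome = std::string("stopped: ") + ex.what();
        }
    });
    worker.join();  // join() makes the write to `outcome` visible here
    return outcome;
}
```

Calling runJobGuarded with a job that invokes checkForInterrupt() returns "completed" before beginShutdown() and "stopped: interrupted at shutdown" afterwards, instead of aborting the process.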

 

Creating this ticket to follow up on the issue.



 Comments   
Comment by Adi Zaimi [ 06/Jul/23 ]

Another instance of the same error and stack trace:

[j7:s] | 2023-07-06T21:46:07.378+00:00 F  CONTROL  6384300 [thread22] "Writing fatal message","attr":
{
    message: "DBException::toString(): InterruptedAtShutdown: interrupted at shutdown\nActual exception type:
mongo::error_details::ExceptionForImpl<(mongo::ErrorCodes::Error)11600,
mongo::ExceptionForCat<(mongo::ErrorCategory)2>,
mongo::ExceptionForCat<(mongo::ErrorCategory)7>,
mongo::ExceptionForCat<(mongo::ErrorCategory)8>,
mongo::ExceptionForCat<(mongo::ErrorCategory)14> >\n\n"
}

This results in a call to "mongo::(anonymous namespace)::myTerminate()". The log can be found here:
https://parsley.mongodb.com/resmoke/b7fd16fcc95e2bb904e4ae4051fd744f/test/176f657bb83341a80d56618b63398db2?bookmarks=0,96,1179&shareLine=96

Comment by Adi Zaimi [ 03/Jul/23 ]

For the record, the `InterruptedAtShutdown: interrupted at shutdown` trace can be seen here in the mongos output log.

Generated at Thu Feb 08 06:38:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.