[SERVER-32446] Mongod crashes when config servers upgraded to RS Created: 21/Dec/17  Updated: 30/Oct/23  Resolved: 31/Jan/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.2.18
Fix Version/s: 3.2.19

Type: Bug Priority: Major - P3
Reporter: Misha Tyulenev Assignee: Misha Tyulenev
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-36139 Cluster-wide crash due to segfaults i... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2018-01-29, Sharding 2018-02-12
Participants:
Case:

 Description   

Mongod 3.2.18 crashes with segfault during upgrade of config servers (primary running with sccc AND RS). The crash occured while primary config node was stepped down prior to sccc mode being removed.
Here's the parsed stack trace using http://downloads.mongodb.org/linux/mongodb-linux-x86_64-rhel62-debugsymbols-3.2.18.tgz:

 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/util/stacktrace_posix.cpp:171:0: mongo::printStackTrace(std::ostream&)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/util/signal_handlers_synchronous.cpp:180:0: mongo::(anonymous namespace)::printSignalAndBacktrace(int)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/util/signal_handlers_synchronous.cpp:276:0: mongo::(anonymous namespace)::abruptQuitWithAddrSignal(int, siginfo*, void*)
 ??:0:0: ??
 ??:0:0: ??
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/bits/char_traits.h:271:0: std::char_traits<char>::copy(char*, char const*, unsigned long)
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/bits/basic_string.h:359:0: std::string::_M_copy(char*, char const*, unsigned long)
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/bits/basic_string.h:404:0: std::string::_S_copy_chars(char*, char const*, char const*)
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/bits/basic_string.tcc:140:0: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag)
 /data/mci/f5d89cb468d530e4577e158d8bd78d5e/toolchain-builder/build-gcc-v1.sh-vDX/x86_64-mongodb-linux/libstdc++-v3/include/bits/basic_string.h:1725:0: _S_construct_aux<char const*>
 /data/mci/f5d89cb468d530e4577e158d8bd78d5e/toolchain-builder/build-gcc-v1.sh-vDX/x86_64-mongodb-linux/libstdc++-v3/include/bits/basic_string.h:1746:0: _S_construct<char const*>
 /data/mci/f5d89cb468d530e4577e158d8bd78d5e/toolchain-builder/build-gcc-v1.sh-vDX/x86_64-mongodb-linux/libstdc++-v3/include/bits/basic_string.tcc:207:0: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/base/string_data.h:177:0: mongo::StringData::toString() const
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/client/connection_string.cpp:46:0: mongo::ConnectionString::ConnectionString(mongo::StringData, std::vector<mongo::HostAndPort, std::allocator<mongo::HostAndPort> >)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/client/connection_string.cpp:83:0: mongo::ConnectionString::forReplicaSet(mongo::StringData, std::vector<mongo::HostAndPort, std::allocator<mongo::HostAndPort> >)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/s/client/sharding_network_connection_hook.cpp:83:0: mongo::ShardingNetworkConnectionHook::validateHostImpl(mongo::HostAndPort const&, mongo::executor::RemoteCommandResponse const&, bool)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/s/sharding_initialization.cpp:173:0: operator()
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/functional:2057:0: std::_Function_handler<mongo::Status (mongo::HostAndPort const&, mongo::executor::RemoteCommandResponse const&), mongo::initializeGlobalShardingState(mongo::OperationContext*, mongo::ConnectionString const&, bool)::'lambda'(mongo::HostAndPort const&, mongo::executor::RemoteCommandResponse const&)>::_M_invoke(std::_Any_data const&, mongo::HostAndPort const&, mongo::executor::RemoteCommandResponse const&)
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/functional:2464:0: std::function<mongo::Status (mongo::HostAndPort const&, mongo::executor::RemoteCommandResponse const&)>::operator()(mongo::HostAndPort const&, mongo::executor::RemoteCommandResponse const&) const
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/client/syncclusterconnection.cpp:207:0: operator()
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/functional:2057:0: std::_Function_handler<mongo::Status (mongo::executor::RemoteCommandResponse const&), mongo::SyncClusterConnection::_connect(std::string const&)::'lambda'(mongo::executor::RemoteCommandResponse const&)>::_M_invoke(std::_Any_data const&, mongo::executor::RemoteCommandResponse const&)
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/functional:2464:0: std::function<mongo::Status (mongo::executor::RemoteCommandResponse const&)>::operator()(mongo::executor::RemoteCommandResponse const&) const
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/client/dbclient.cpp:950:0: mongo::DBClientConnection::connect(mongo::HostAndPort const&)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/client/syncclusterconnection.cpp:215:0: mongo::SyncClusterConnection::_connect(std::string const&)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/client/syncclusterconnection.cpp:71:0: mongo::SyncClusterConnection::SyncClusterConnection(std::list<mongo::HostAndPort, std::allocator<mongo::HostAndPort> > const&, double)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/client/connection_string_connect.cpp:77:0: mongo::ConnectionString::connect(std::string&, double) const
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/client/connpool.cpp:292:0: mongo::DBConnectionPool::get(std::string const&, double)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/client/connpool.cpp:494:0: mongo::ScopedDbConnection::ScopedDbConnection(std::string const&, double)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/s/catalog/legacy/legacy_dist_lock_pinger.cpp:71:0: mongo::LegacyDistLockPinger::_distLockPingThread(mongo::ConnectionString, std::string const&, std::chrono::duration<long, std::ratio<1l, 1000l> >)
 /data/mci/c1844f0cf65f07c4d744b1b1343bf2dd/src/src/mongo/s/catalog/legacy/legacy_dist_lock_pinger.cpp:213:0: mongo::LegacyDistLockPinger::distLockPingThread(mongo::ConnectionString, long long, std::string const&, std::chrono::duration<long, std::ratio<1l, 1000l> >)
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/functional:601:0: operator()<mongo::ConnectionString&, long long int&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::chrono::duration<long int, std::ratio<1l, 1000l> >&, void>
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/functional:1296:0: __call<void, 0ul, 1ul, 2ul, 3ul, 4ul>
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/functional:1355:0: operator()<, void>
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/functional:1732:0: _M_invoke<>
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/functional:1720:0: std::_Bind_simple<std::_Bind<std::_Mem_fn<void (mongo::LegacyDistLockPinger::*)(mongo::ConnectionString, long long, std::string const&, std::chrono::duration<long, std::ratio<1l, 1000l> >)> (mongo::LegacyDistLockPinger*, mongo::ConnectionString, long long, std::string, std::chrono::duration<long, std::ratio<1l, 1000l> >)> ()>::operator()()
 /opt/mongodbtoolchain/v1/include/c++/4.8.2/thread:115:0: std::thread::_Impl<std::_Bind_simple<std::_Bind<std::_Mem_fn<void (mongo::LegacyDistLockPinger::*)(mongo::ConnectionString, long long, std::string const&, std::chrono::duration<long, std::ratio<1l, 1000l> >)> (mongo::LegacyDistLockPinger*, mongo::ConnectionString, long long, std::string, std::chrono::duration<long, std::ratio<1l, 1000l> >)> ()> >::_M_run()
 /data/mci/f5d89cb468d530e4577e158d8bd78d5e/toolchain-builder/build-gcc-v1.sh-vDX/x86_64-mongodb-linux/libstdc++-v3/src/c++11/../../../../../gcc-4.8.2/libstdc++-v3/src/c++11/thread.cc:84:0: execute_native_thread_routine
 ??:0:0: ??
 ??:0:0: ??

The same root issue as in SERVER-25029 and a similar fix needs to be made here:
https://github.com/mongodb/mongo/blob/r3.2.18/src/mongo/s/client/sharding_network_connection_hook.cpp?utf8=%E2%9C%93#L81-L85

The lines https://github.com/mongodb/mongo/blob/r3.2.18/src/mongo/s/client/sharding_connection_hook.cpp#L185-L187 are missing in the code pointed out.



 Comments   
Comment by Githook User [ 31/Jan/18 ]

Author:

{'email': 'misha@mongodb.com', 'name': 'Misha Tyulenev', 'username': 'mikety'}

Message: SERVER-32446 check if CSRS is initiated when reading isMaster response setName field
Branch: v3.2
https://github.com/mongodb/mongo/commit/45334def42a256b7252cee7762cd16954ad7539f

Comment by Githook User [ 30/Jan/18 ]

Author:

{'email': 'misha@mongodb.com', 'name': 'Misha Tyulenev', 'username': 'mikety'}

Message: Revert "SERVER-32446 check if CSRS is initiated when reading isMaster response setName field"

This reverts commit 6fc36be1614f4ae641fefd2d5085958025bac3b4.
Branch: v3.2
https://github.com/mongodb/mongo/commit/a55aa869145e7e35f6f268b6fa74b6a72240bc77

Comment by Githook User [ 29/Jan/18 ]

Author:

{'email': 'misha@mongodb.com', 'name': 'Misha Tyulenev', 'username': 'mikety'}

Message: SERVER-32446 check if CSRS is initiated when reading isMaster response setName field
Branch: v3.2
https://github.com/mongodb/mongo/commit/6fc36be1614f4ae641fefd2d5085958025bac3b4

Generated at Thu Feb 08 04:30:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.