[SERVER-34854] Backtrace during SSL downgrade of a sharded cluster Created: 04/May/18  Updated: 11/May/18  Resolved: 08/May/18

Status: Closed
Project: Core Server
Component/s: Security
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Timothy Olsen (Inactive) Assignee: Mira Carey
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments:
cfg1.2018-05-04T09-15-04.log.gz
cfg1.2018-05-04T09-15-52.log.gz
cfg1.log.gz
cfg2.2018-05-04T09-10-03.log.gz
cfg2.2018-05-04T09-16-26.log.gz
cfg2.log.gz
cfg3.2018-05-04T09-13-11.log.gz
cfg3.2018-05-04T09-17-24.log.gz
mongos1.2018-05-04T09-10-33.log.gz
mongos1.2018-05-04T09-15-49.log.gz
mongos1.log.gz
mongos2.2018-05-04T09-13-47.log.gz
mongos2.2018-05-04T09-17-03.log.gz
mongos2.log.gz
shardA-1.2018-05-04T09-13-00.log.gz
shardA-1.2018-05-04T09-57-27.log.gz
shardA-2.2018-05-04T09-11-38.log.gz
shardA-2.2018-05-04T09-16-25.log.gz
shardA-2.2018-05-04T09-57-27.log.gz
shardA-3.2018-05-04T09-10-02.log.gz
shardA-3.2018-05-04T09-15-52.log.gz
shardA-3.2018-05-04T09-57-27.log.gz
shardB-1.2018-05-04T09-10-03.log.gz
shardB-1.2018-05-04T09-16-27.log.gz
shardB-1.2018-05-04T09-57-27.log.gz
shardB-2.2018-05-04T09-09-33.log.gz
shardB-2.2018-05-04T09-16-01.log.gz
shardB-2.2018-05-04T09-17-13.log.gz
shardB-3.2018-05-04T09-09-17.log.gz
shardB-3.2018-05-04T09-15-47.log.gz
shardB-3.2018-05-04T09-16-28.log.gz
shardB-3.log.gz
Issue Links:
Duplicates: SERVER-34901 Extend lifetime of tlconnections (Closed)
Operating System: ALL
Steps To Reproduce:
  1. Set up a MongoDB 3.7.9 2-shard PSA cluster with 3 config servers and 2 mongoses, with sslMode set to disabled.
  2. Restart each member with sslMode set to allowSSL.
  3. Run db.adminCommand( { setParameter: 1, sslMode: "preferSSL" } ) on each member.
  4. Run db.adminCommand( { setParameter: 1, sslMode: "requireSSL" } ).
  5. Restart each member with sslMode set to preferSSL. Some of the members will backtrace during this time.
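For reference, the sslMode used in the restart steps can also be set in the configuration file rather than on the command line. A minimal fragment for the step 5 restart might look like the following (the certificate paths are placeholders, not taken from this ticket):

```yaml
net:
  ssl:
    mode: preferSSL
    PEMKeyFile: /etc/ssl/mongodb.pem   # placeholder path
    CAFile: /etc/ssl/ca.pem            # placeholder path
```

In 3.x the valid `net.ssl.mode` values are disabled, allowSSL, preferSSL, and requireSSL, matching the transitions exercised above.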
Sprint: Platforms 2018-05-21
Participants:

 Description   

I get backtraces on 4 members of a MongoDB 3.7.9 sharded cluster while attempting an SSL downgrade from requireSSL to preferSSL.  The downgrade was attempted by restarting each member with sslMode set to preferSSL.

This is a 2-shard cluster with each shard having 3 members, 1 of which is an arbiter.  There are 3 config servers and 2 mongoses.

The cluster first went through an upgrade of SSL from disabled to requireSSL.  This was done by first restarting each member with allowSSL.  Then there was one round of upgrading to preferSSL using setParameter, and then another round of setParameter to get to requireSSL.

Then after that the downgrade was attempted (via restarting, not setParameter).

I am attaching log files for all cluster members.  The following log files show backtraces:

mongos2.2018-05-04T09-17-03.log
shardA-2.2018-05-04T09-16-25.log
shardB-1.2018-05-04T09-16-27.log
shardB-3.2018-05-04T09-16-28.log

 



 Comments   
Comment by Mira Carey [ 08/May/18 ]

I've managed to run this one down.  The underlying bug involves a race between

  1. a timer which tells us we've timed out trying to get a connection in the connection pool
  2. the setup for new connections in the pool (connecting, authing, isMastering, etc)

If the timeout fires while the first phase is running, we fall off the end and write/read uninitialized memory.

I've filed SERVER-34901 for the specific fix.
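The race described above is between two paths that both want to complete the same piece of shared state: the pool's timeout timer, and the connection-setup continuation (connect, auth, isMaster). A minimal Python sketch of the pattern follows; this is illustrative only (the actual server code is C++, and the real fix in SERVER-34901 is to extend the lifetime of the transport-layer connections). The sketch shows the guard the buggy path was missing: exactly one of the two racers may transition the state to finished, and the loser must detect that and back off rather than keep using the state.

```python
import threading

class SharedState:
    """Stand-in for the pool's shared connection state (illustrative only)."""
    def __init__(self):
        self._lock = threading.Lock()
        self.finished = False
        self.result = None

    def transition_to_finished(self, result):
        # Only the first caller wins. Without this guard, the timeout
        # path and the setup path can both "complete" the state, and the
        # loser goes on to read/write memory that is no longer valid.
        with self._lock:
            if self.finished:
                return False
            self.finished = True
            self.result = result
            return True

def run_race():
    state = SharedState()
    # The two racers: the pool's acquisition timeout, and the
    # connection-setup continuation finishing its work.
    timeout = threading.Thread(target=state.transition_to_finished,
                               args=("timed out",))
    setup = threading.Thread(target=state.transition_to_finished,
                             args=("connected",))
    timeout.start(); setup.start()
    timeout.join(); setup.join()
    return state.result

# Whichever thread wins, the state holds exactly one outcome.
print(run_race())
```

Keeping the state alive for as long as either racer can still touch it (the "extend lifetime" part of the fix) is the other half: the guard prevents double completion, and the lifetime extension prevents the loser from touching freed memory.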

Comment by Timothy Olsen (Inactive) [ 08/May/18 ]

The cluster auth upgrade included SSL that had already been fully transitioned, but the auth transitioning did not include any SSL.

Comment by Mira Carey [ 07/May/18 ]

tim.olsen, when you said that cluster auth upgrades and auth transitioning also crashed, did those also include SSL (non-transitioning SSL)?

Comment by Timothy Olsen (Inactive) [ 07/May/18 ]

I'm also seeing backtraces on SSL upgrades, as well as cluster auth upgrades and auth transitioning. Sounds like they might all be related? Again, only OS X.

Comment by Kaloian Manassiev [ 04/May/18 ]

Demangled stack trace:

mongos(mongo::printStackTrace(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) 0x39) [0x108d330e9]
 mongos(mongo::(anonymous namespace)::abruptQuitWithAddrSignal(int, __siginfo*, void*) 0x12A) [0x108d32b7a]
 libsystem_platform.dylib(_sigtramp 0x1A) [0x7fffc2c3bb3a]
 ??? [0x10a395400]
 mongos(_ZNSt3__110__function6__funcIZN5mongo14future_details6FutureINS2_7MessageEE16makeContinuationIvZZNOS6_4thenIZNS2_13AsyncDBClient15initWireVersionERKNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEEPNS2_8executor21NetworkConnectionHookEE3$_4vvEENS4_IT0_EEOT_ENKUlvE_clEvEUlPNS3_15SharedStateImplIS5_EEPNSR_INS3_8FakeVoidEEEE_EENS4_ISO_EEOSM_EUlPNS3_15SharedStateBaseEE_NSD_IS12_EEFvS11_EEclEOS11_ 0x6A) [0x1087a9c2a]
 mongos(mongo::future_details::SharedStateBase::transitionToFinished() 0x3F) [0x1082f6a5f]
 mongos(_ZNSt3__110__function6__funcIZN5mongo14future_details6FutureINS2_7MessageEE16makeContinuationIS5_ZZNOS6_4thenIZNS2_13AsyncDBClient5_callES5_E3$_6S5_vEENS4_IT0_EEOT_ENKUlvE_clEvEUlPNS3_15SharedStateImplIS5_EESI_E_EENS4_ISD_EEOSB_EUlPNS3_15SharedStateBaseEE_NS_9allocatorISO_EEFvSN_EEclEOSN_ 0x75) [0x1087aa025]
 mongos(mongo::future_details::SharedStateBase::transitionToFinished() 0x3F) [0x1082f6a5f]
 mongos(mongo::future_details::SharedStateBase::transitionToFinished() 0x3F) [0x1082f6a5f]
 mongos(mongo::future_details::Future<mongo::transport::TransportLayerASIO::ASIOSession::sourceMessageImpl()::{lambda(unsigned long)#1}> mongo::future_details::Future<unsigned long>::then<mongo::transport::TransportLayerASIO::ASIOSession::sourceMessageImpl()::{lambda(unsigned long)#1}, mongo::future_details::Future<mongo::Message>, void, mongo::transport::TransportLayerASIO::ASIOSession::sourceMessageImpl()::{lambda(unsigned long)#1}>(mongo::transport::TransportLayerASIO::ASIOSession::sourceMessageImpl()::{lambda(unsigned long)#1}&&) &&::{lambda()#1}::operator()() const::{lambda(mongo::future_details::SharedStateImpl<unsigned long>*, {lambda()#1}<mongo::transport::TransportLayerASIO::ASIOSession::sourceMessageImpl()::{lambda(unsigned long)#1}>*)#1}::operator()(mongo::future_details::SharedStateImpl, mongo::future_details::SharedStateImpl<unsigned long>*) 0xCB) [0x108867c8b]
 mongos(mongo::future_details::SharedStateBase::transitionToFinished() 0x3F) [0x1082f6a5f]
 mongos(mongo::future_details::Future<unsigned long> mongo::future_details::Future<unsigned long>::then<mongo::future_details::Future<unsigned long> mongo::transport::TransportLayerASIO::ASIOSession::opportunisticRead<asio::basic_stream_socket<asio::generic::stream_protocol>, asio::mutable_buffers_1>(asio::basic_stream_socket<asio::generic::stream_protocol>&, asio::mutable_buffers_1 const&)::{lambda(unsigned long)#1}, unsigned long, void>(asio::basic_stream_socket<asio::generic::stream_protocol>&&) &&::{lambda()#1}::operator()() const::{lambda(mongo::future_details::SharedStateImpl<unsigned long>*, mongo::future_details::SharedStateImpl)#1}::operator()(mongo::future_details::SharedStateImpl, mongo::future_details::SharedStateImpl) 0x6A) [0x10885a3da]
 mongos(mongo::future_details::SharedStateBase::transitionToFinished() 0x3F) [0x1082f6a5f]
 mongos(void mongo::transport::use_future_details::AsyncHandlerHelper<std::__1::error_code, unsigned long>::complete<unsigned long>(mongo::SharedPromise<unsigned long>*, std::__1::error_code, unsigned long&&) 0x101) [0x1088599c1]
 mongos(asio::detail::read_op<asio::ssl::stream<asio::basic_stream_socket<asio::generic::stream_protocol> >, asio::mutable_buffers_1, asio::mutable_buffer const*, asio::detail::transfer_all_t, mongo::transport::use_future_details::AsyncHandler<std::__1::error_code, unsigned long> >::operator()(std::__1::error_code const&, unsigned long, int) 0xB9) [0x108858639]
 mongos(asio::ssl::detail::io_op<asio::basic_stream_socket<asio::generic::stream_protocol>, asio::ssl::detail::read_op<asio::mutable_buffers_1>, asio::detail::read_op<asio::ssl::stream<asio::basic_stream_socket<asio::generic::stream_protocol> >, asio::mutable_buffers_1, asio::mutable_buffer const*, asio::detail::transfer_all_t, mongo::transport::use_future_details::AsyncHandler<std::__1::error_code, unsigned long> > >::operator()(std::__1::error_code, unsigned long, int) 0x350) [0x108858ab0]
 mongos(asio::detail::reactive_socket_recv_op<asio::mutable_buffers_1, asio::ssl::detail::io_op<asio::basic_stream_socket<asio::generic::stream_protocol>, asio::ssl::detail::read_op<asio::mutable_buffers_1>, asio::detail::read_op<asio::ssl::stream<asio::basic_stream_socket<asio::generic::stream_protocol> >, asio::mutable_buffers_1, asio::mutable_buffer const*, asio::detail::transfer_all_t, mongo::transport::use_future_details::AsyncHandler<std::__1::error_code, unsigned long> > > >::do_complete(void*, asio::detail::scheduler_operation*, std::__1::error_code const&, unsigned long) 0x111) [0x108858ea1]
 mongos(asio::detail::scheduler::do_run_one(asio::detail::conditionally_enabled_mutex::scoped_lock&, asio::detail::scheduler_thread_info&, std::__1::error_code const&) 0x246) [0x108c981f6]
 mongos(asio::detail::scheduler::run(std::__1::error_code&) 0xD9) [0x108c8e929]
 mongos(asio::io_context::run() 0x39) [0x108c8e7b9]
 mongos(mongo::transport::TransportLayerASIO::ASIOReactor::run() 0x4B) [0x10886d7bb]
 mongos(_ZNSt3__114__thread_proxyINS_5tupleIJZN5mongo8executor18NetworkInterfaceTL7startupEvE3$_3EEEEEPvS7_ 0x11B) [0x10879571b]

Comment by Timothy Olsen (Inactive) [ 04/May/18 ]

I forgot to mention that I've only seen this happen on OS X.

Generated at Thu Feb 08 04:38:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.