[SERVER-70771] Invariant failure in ConnectionMetrics Created: 21/Oct/22  Updated: 27/Oct/23  Resolved: 02/Nov/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Andrew Shuvalov (Inactive) Assignee: Blake Oler
Resolution: Gone away Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-69584 Pass ConnectionMetrics by shared_ptr Closed
is depended on by SERVER-70487 Fix the test delete_range_deletion_ta... Closed
Operating System: ALL
Steps To Reproduce:

It can be reproduced only in the catalog shard POC, with the test jstests/sharding/delete_range_deletion_tasks_on_stepup_after_drop_collection.js. I think the general assumption that a raw pointer to ConnectionMetrics can survive a step down followed by a step up is broken.

Sprint: Service Arch 2022-11-14
Participants:

 Description   

#3  0x00007f99cafb4d1d in mongo::invariantFailed (expr=0x7f99c888be1d "_stopWatch", file=0x7f99c888bce0 "src/mongo/executor/connection_metrics.h", line=84) at src/mongo/util/assert_util.cpp:143
#4  0x00007f99c88176d6 in mongo::invariantWithLocation<boost::optional<mongo::ClockSource::StopWatch> > (testOK=boost::optional<mongo::ClockSource::StopWatch> is not initialized, expr=0x2 <error: Cannot access memory at address 0x2>, file=<optimized out>, line=84) at src/mongo/util/assert_util_core.h:74
#5  mongo::ConnectionMetrics::onDNSResolved (this=0x5562d318bbf0) at src/mongo/executor/connection_metrics.h:84
#6  mongo::transport::TransportLayerASIO::asyncConnect(mongo::HostAndPort, mongo::transport::ConnectSSLMode, std::shared_ptr<mongo::transport::Reactor> const&, mongo::Duration<std::ratio<1l, 1000l> >, mongo::ConnectionMetrics*, std::shared_ptr<mongo::transport::SSLConnectionContext const>)::$_6::operator()(std::vector<mongo::transport::WrappedEndpoint, std::allocator<mongo::transport::WrappedEndpoint> >) const (this=<optimized out>, results=std::vector of length 1, capacity -533425857672 = {...}) at src/mongo/transport/transport_layer_asio.cpp:817
#7  mongo::future_details::call<mongo::transport::TransportLayerASIO::asyncConnect(mongo::HostAndPort, mongo::transport::ConnectSSLMode, std::shared_ptr<mongo::transport::Reactor> const&, mongo::Duration<std::ratio<1l, 1000l> >, mongo::ConnectionMetrics*, std::shared_ptr<mongo::transport::SSLConnectionContext const>)::$_6&, std::vector<mongo::transport::WrappedEndpoint, std::allocator<mongo::transport::WrappedEndpoint> > >(mongo::transport::TransportLayerASIO::asyncConnect(mongo::HostAndPort, mongo::transport::ConnectSSLMode, std::shared_ptr<mongo::transport::Reactor> const&, mongo::Duration<std::ratio<1l, 1000l> >, mongo::ConnectionMetrics*, std::shared_ptr<mongo::transport::SSLConnectionContext const>)::$_6&, std::vector<mongo::transport::WrappedEndpoint, std::allocator<mongo::transport::WrappedEndpoint> >&&) (func=..., arg=...) at src/mongo/util/future_impl.h:291
#8  mongo::future_details::throwingCall<mongo::transport::TransportLayerASIO::asyncConnect(mongo::HostAndPort, mongo::transport::ConnectSSLMode, std::shared_ptr<mongo::transport::Reactor> const&, mongo::Duration<std::ratio<1l, 1000l> >, mongo::ConnectionMetrics*, std::shared_ptr<mongo::transport::SSLConnectionContext const>)::$_6&, std::vector<mongo::transport::WrappedEndpoint, std::allocator<mongo::transport::WrappedEndpoint> > >(mongo::transport::TransportLayerASIO::asyncConnect(mongo::HostAndPort, mongo::transport::ConnectSSLMode, std::shared_ptr<mongo::transport::Reactor> const&, mongo::Duration<std::ratio<1l, 1000l> >, mongo::ConnectionMetrics*, std::shared_ptr<mongo::transport::SSLConnectionContext const>)::$_6&, std::vector<mongo::transport::WrappedEndpoint, std::allocator<mongo::transport::WrappedEndpoint> >&&) (func=..., args=...) at src/mongo/util/future_impl.h:349
#9  0x00007f99c8817b4d in mongo::future_details::FutureImpl<std::vector<mongo::transport::WrappedEndpoint, std::allocator<mongo::transport::WrappedEndpoint> > >::then<mongo::CleanupFuturePolicy<false>, mongo::transport::TransportLayerASIO::asyncConnect(mongo::HostAndPort, mongo::transport::ConnectSSLMode, std::shared_ptr<mongo::transport::Reactor> const&, mongo::Duration<std::ratio<1l, 1000l> >, mongo::ConnectionMetrics*, std::shared_ptr<mongo::transport::SSLConnectionContext const>)::$_6, 0>(mongo::CleanupFuturePolicy<false>, mongo::transport::TransportLayerASIO::asyncConnect(mongo::HostAndPort, mongo::transport::ConnectSSLMode, std::shared_ptr<mongo::transport::Reactor> const&, mongo::Duration<std::ratio<1l, 1000l> >, mongo::ConnectionMetrics*, std::shared_ptr<mongo::transport::SSLConnectionContext const>)::$_6&&) &&::{lambda()#1}::operator()() const::{lambda(mongo::future_details::SharedStateImpl<std::vector<mongo::transport::WrappedEndpoint, std::allocator<mongo::transport::WrappedEndpoint> > >*, mongo::future_details::SharedStateImpl<mongo::future_details::FakeVoid>*)#1}::operator()(mongo::future_details::SharedStateImpl<std::vector<mongo::transport::WrappedEndpoint, std::allocator<mongo::transport::WrappedEndpoint> > >*, mongo::future_details::SharedStateImpl<mongo::future_details::FakeVoid>*) (this=<optimized out>, input=<optimized out>, output=0x5562d3f14300) at src/mongo/util/future_impl.h:996
#10 mongo::future_details::FutureImpl<std::vector<mongo::transport::WrappedEndpoint, std::allocator<mongo::transport::WrappedEndpoint> > >::makeContinuation<void, mongo::future_details::FutureImpl<std::vector<mongo::transport::WrappedEndpoint, std::allocator<mongo::transport::WrappedEndpoint> > >::then<mongo::CleanupFuturePolicy<false>, mongo::transport::TransportLayerASIO::asyncConnect(mongo::HostAndPort, mongo::transport::ConnectSSLMode, std::shared_ptr<mongo::transport::Reactor> const&, mongo::Duration<std::ratio<1l, 1000l> >, mongo::ConnectionMetrics*, std::shared_ptr<mongo::transport::SSLConnectionContext const>)::$_6, 0>(mongo::CleanupFuturePolicy<false>, mongo::transport::TransportLayerASIO::asyncConnect(mongo::HostAndPort, mongo::transport::ConnectSSLMode, std::shared_ptr<mongo::transport::Reactor> const&, mongo::Duration<std::ratio<1l, 1000l> >, mongo::ConnectionMetrics*, std::shared_ptr<mongo::transport::SSLConnectionContext const>)::$_6&&) &&::{lambda()#1}::operator()() const::{lambda(mongo::future_details::SharedStateImpl<std::vector<mongo::transport::WrappedEndpoint, std::allocator<mongo::transport::WrappedEndpoint> > >*, mongo::future_details::SharedStateImpl<mongo::future_details::FakeVoid>*)#1}>(mongo::transport::TransportLayerASIO::asyncConnect(mongo::HostAndPort, mongo::transport::ConnectSSLMode, std::shared_ptr<mongo::transport::Reactor> const&, mongo::Duration<std::ratio<1l, 1000l> >, mongo::ConnectionMetrics*, std::shared_ptr<mongo::transport::SSLConnectionContext const>)::$_6&&)::{lambda(mongo::future_details::SharedStateBase*)#1}::operator()(mongo::future_details::SharedStateBase*) (this=<optimized out>, ssb=<optimized out>) at src/mongo/util/future_impl.h:1327
...
#29 mongo::Promise<asio::ip::basic_resolver_results<asio::ip::tcp> >::emplaceValue<asio::ip::basic_resolver_results<asio::ip::tcp> const&, 0> (this=0x7f99abc25010, args=...) at src/mongo/util/future.h:976
#30 mongo::transport::UseFuture::Adapter<std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >::Handler::_onSuccess<asio::ip::basic_resolver_results<asio::ip::tcp> const&> (this=0x7f99abc25010, args=...) at src/mongo/transport/asio_utils.h:224
#31 mongo::transport::UseFuture::Adapter<std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >::Handler::_onInvoke<asio::ip::basic_resolver_results<asio::ip::tcp> const&> (this=0x7f99abc25010, ec=..., args=...) at src/mongo/transport/asio_utils.h:236
#32 0x00007f99c882e8ab in mongo::transport::UseFuture::Adapter<std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >::Handler::operator()<std::error_code const&, asio::ip::basic_resolver_results<asio::ip::tcp> const&> (args=..., args=..., this=<optimized out>) at src/mongo/transport/asio_utils.h:246
#33 asio::detail::binder2<mongo::transport::UseFuture::Adapter<std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >::Handler, std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >::operator() (this=0x0) at src/third_party/asio-master/asio/include/asio/detail/bind_handler.hpp:163
#34 asio::asio_handler_invoke<asio::detail::binder2<mongo::transport::UseFuture::Adapter<std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >::Handler, std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> > > (function=...) at src/third_party/asio-master/asio/include/asio/handler_invoke_hook.hpp:68
#35 asio_handler_invoke_helpers::invoke<asio::detail::binder2<mongo::transport::UseFuture::Adapter<std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >::Handler, std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >, mongo::transport::UseFuture::Adapter<std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >::Handler> (function=..., context=...) at src/third_party/asio-master/asio/include/asio/detail/handler_invoke_helpers.hpp:37
#36 asio::detail::handler_work<mongo::transport::UseFuture::Adapter<std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >::Handler, asio::system_executor>::complete<asio::detail::binder2<mongo::transport::UseFuture::Adapter<std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >::Handler, std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> > > (this=<optimized out>, function=..., handler=...) at src/third_party/asio-master/asio/include/asio/detail/handler_work.hpp:81
#37 asio::detail::resolve_query_op<asio::ip::tcp, mongo::transport::UseFuture::Adapter<std::error_code, asio::ip::basic_resolver_results<asio::ip::tcp> >::Handler>::do_complete (owner=0x5562d2888000, base=0x5562d3a26460) at src/third_party/asio-master/asio/include/asio/detail/resolve_query_op.hpp:115
#38 0x00007f99c1cc2387 in asio::detail::scheduler_operation::complete (this=<optimized out>, owner=0x5562d2888000, ec=..., bytes_transferred=<optimized out>) at src/third_party/asio-master/asio/include/asio/detail/scheduler_operation.hpp:39
#39 asio::detail::scheduler::do_run_one (this=0x5562d2888000, lock=..., this_thread=..., ec=...) at src/third_party/asio-master/asio/include/asio/detail/impl/scheduler.ipp:400
#40 0x00007f99c1cb8468 in asio::detail::scheduler::run (this=0x5562d2888000, ec=...) at src/third_party/asio-master/asio/include/asio/detail/impl/scheduler.ipp:153
#41 0x00007f99c1cb8326 in asio::io_context::run (this=0x5562d291eda8) at src/third_party/asio-master/asio/include/asio/impl/io_context.ipp:61
#42 0x00007f99c881faec in mongo::transport::TransportLayerASIO::ASIOReactor::run (this=<optimized out>) at src/mongo/transport/transport_layer_asio.cpp:195
#43 0x00007f99c0edeebe in mongo::executor::NetworkInterfaceTL::_run (this=0x5562d25a3000) at src/mongo/executor/network_interface_tl.cpp:261
#44 0x00007f99c0ef6113 in mongo::executor::NetworkInterfaceTL::startup()::$_1::operator()() const (this=<optimized out>) at src/mongo/executor/network_interface_tl.cpp:249
...
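
The trace above shows asyncConnect taking a raw mongo::ConnectionMetrics* and the _stopWatch invariant firing when the DNS resolution callback runs later on the reactor thread. The ticket does not say exactly how _stopWatch ended up unset, so the following is only a minimal sketch of the lifetime concern and of the direction named in the title of SERVER-69584 (passing the metrics object by shared_ptr). Every name here (SketchMetrics, SketchReactor, asyncConnectSketch, ownerGoesAwayBeforeDnsCompletes) is hypothetical and is not the server's code; std::optional stands in for boost::optional, and a bool return stands in for the invariant.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for ConnectionMetrics; NOT the server's class.
class SketchMetrics {
public:
    using Clock = std::chrono::steady_clock;

    // Starts the timer, analogous to initializing _stopWatch.
    void onConnectionStarted() { _start = Clock::now(); }

    // Returns false where the real code would fail invariant(_stopWatch).
    bool onDNSResolved() {
        if (!_start) return false;
        _dnsResolution = Clock::now() - *_start;
        return true;
    }

private:
    std::optional<Clock::time_point> _start;
    std::optional<Clock::duration> _dnsResolution;
};

// Hypothetical stand-in for a reactor: queues callbacks and runs them later.
class SketchReactor {
public:
    void schedule(std::function<void()> task) { _tasks.push_back(std::move(task)); }

    void drain() {
        auto tasks = std::move(_tasks);
        _tasks.clear();
        for (auto& task : tasks) task();
    }  // 'tasks' is destroyed here, releasing anything the callbacks captured

private:
    std::vector<std::function<void()>> _tasks;
};

// The callback captures the shared_ptr by value, so it co-owns the metrics
// object until it has run. With a raw pointer, nothing would keep it alive.
void asyncConnectSketch(SketchReactor& reactor,
                        std::shared_ptr<SketchMetrics> metrics,
                        bool* dnsStepOk) {
    metrics->onConnectionStarted();
    reactor.schedule([metrics, dnsStepOk] { *dnsStepOk = metrics->onDNSResolved(); });
}

// Models the owner dropping its reference (assumed here to happen on step
// down) before the delayed DNS callback fires.
bool ownerGoesAwayBeforeDnsCompletes() {
    SketchReactor reactor;
    bool dnsStepOk = false;
    std::weak_ptr<SketchMetrics> observer;
    {
        auto metrics = std::make_shared<SketchMetrics>();
        observer = metrics;
        asyncConnectSketch(reactor, metrics, &dnsStepOk);
    }  // the owner's reference is gone; only the queued callback holds one
    const bool aliveBeforeCallback = !observer.expired();
    reactor.drain();
    return aliveBeforeCallback && dnsStepOk && observer.expired();
}
```

In this sketch, calling ownerGoesAwayBeforeDnsCompletes() returns true: the object stays alive until the queued callback has run and is released afterwards. Calling onDNSResolved() on a fresh SketchMetrics returns false, which mirrors the "not initialized" optional seen in frame #4.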



 Comments   
Comment by Andrew Shuvalov (Inactive) [ 02/Nov/22 ]

Confirmed this was the DNS delay problem fixed in SERVER-69584. I tried to reproduce it as-is on several past commits in our catalog shard POC, where I had detected the failure, and it always worked just fine.
