|
Looks like the TenantOplogApplier deadlocks with itself on shutdown.
#41 0x00007f72fbc6e924 in mongo::future_details::SharedStateBase::transitionToFinished (this=0x5613ffeca180) at src/mongo/util/future_impl.h:456
|
#42 0x00007f72f314afcf in mongo::future_details::SharedStateBase::setError (statusArg=..., this=0x5613ffeca180) at src/mongo/util/future_impl.h:467
|
#43 mongo::SharedPromise<mongo::repl::TenantOplogApplier::OpTimePair>::setError (status=..., this=0x561403aa4d70) at src/mongo/util/future.h:1139
|
#44 mongo::repl::TenantOplogApplier::_finishShutdown (this=0x5613ffda68d0, lk=..., status=...) at src/mongo/db/repl/tenant_oplog_applier.cpp:240
|
#45 0x00007f72f3138855 in mongo::repl::TenantOplogApplier::_doShutdown_inlock (this=0x5613ffda68d0) at src/mongo/util/concurrency/with_lock.h:100
|
#46 mongo::repl::TenantOplogApplier::_doShutdown_inlock (this=0x5613ffda68d0) at src/mongo/db/repl/tenant_oplog_applier.cpp:141
|
#47 0x00007f72f7798a80 in mongo::repl::AbstractAsyncComponent::shutdown (this=this@entry=0x5613ffda68d0) at src/mongo/db/repl/abstract_async_component.cpp:115
|
#48 0x00007f72f314c9be in mongo::repl::TenantOplogApplier::_shouldStopApplying (this=0x5613ffda68d0, status=Status(InterruptedDueToReplStateChange, "operation was interrupted")) at src/mongo/db/repl/tenant_oplog_applier.cpp:222
|
#49 0x00007f72f314f230 in mongo::repl::TenantOplogApplier::_applyLoop (this=0x5613ffda68d0, batch=...) at /opt/mongodbtoolchain/revisions/32eb70c47bd9e9759dd05654843feb80461aaef3/stow/gcc-v3.P9L/include/c++/8.3.0/bits/atomic_base.h:512
|
Fulfilling a TenantOplogApplier promise runs the continuation inline, and that in turn tries to shut down the TenantOplogApplier.
#5 mongo::latch_detail::Mutex::lock (this=0x5613ffda69a0) at src/mongo/platform/mutex.cpp:66
|
#6 0x00007f72f7798a3c in std::lock_guard<mongo::latch_detail::Latch>::lock_guard (__m=..., this=<synthetic pointer>) at /opt/mongodbtoolchain/revisions/32eb70c47bd9e9759dd05654843feb80461aaef3/stow/gcc-v3.P9L/include/c++/8.3.0/bits/std_mutex.h:161
|
#7 mongo::repl::AbstractAsyncComponent::shutdown (this=0x5613ffda68d0) at src/mongo/db/repl/abstract_async_component.cpp:100
|
#8 0x00007f72f8c52d28 in mongo::repl::(anonymous namespace)::shutdownTarget<std::shared_ptr<mongo::repl::TenantOplogApplier> > (lk=..., target=std::shared_ptr<mongo::repl::TenantOplogApplier> (use count 3, weak count 1) = {...}) at src/mongo/db/repl/tenant_migration_recipient_service.cpp:1452
|
#9 mongo::repl::TenantMigrationRecipientService::Instance::_cancelRemainingWork (this=0x5614048be010, lk=...) at src/mongo/db/repl/tenant_migration_recipient_service.cpp:1558
|
#10 0x00007f72f8c4f538 in mongo::repl::TenantMigrationRecipientService::Instance::<lambda(mongo::Status)>::operator() (__closure=0x561403512e60, status=...) at src/mongo/util/invariant.h:66
|
#11 mongo::future_util_details::AsyncTryUntilWithDelay<mongo::repl::TenantMigrationRecipientService::Instance::run(std::shared_ptr<mongo::executor::ScopedTaskExecutor>, const mongo::CancelationToken&)::<lambda()>, mongo::repl::TenantMigrationRecipientService::Instance::run(std::shared_ptr<mongo::executor::ScopedTaskExecutor>, const mongo::CancelationToken&)::<lambda(mongo::Status)>,
|
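For anyone reading along, here is a minimal sketch of the pattern the two traces above suggest. It is not the server code: `Component`, `finishHoldingLock`, `finishAfterUnlock` and `shutdownFromContinuation` are hypothetical names, and `shutdown()` reports the re-entrant call instead of blocking so that the example terminates. The idea is that a continuation run inline while the component's non-recursive mutex is held cannot call back into `shutdown()`, whereas running it after the lock is released is fine.

```cpp
#include <atomic>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a component guarded by a non-recursive mutex.
class Component {
    // RAII guard that also records which thread owns the mutex, so the
    // sketch can detect re-entrancy instead of hanging.
    struct Guard {
        explicit Guard(Component& c) : _c(c), _lk(c._mutex) {
            _c._owner.store(std::this_thread::get_id());
        }
        ~Guard() {
            _c._owner.store(std::thread::id());  // cleared before the unlock
        }
        Component& _c;
        std::lock_guard<std::mutex> _lk;
    };

    std::vector<std::function<void()>> takeContinuations() {
        return std::exchange(_continuations, {});
    }

    std::mutex _mutex;
    std::atomic<std::thread::id> _owner{std::thread::id()};
    std::vector<std::function<void()>> _continuations;
    bool _shutDown = false;

public:
    // Registers a continuation to run when the component finishes.
    void onFinished(std::function<void()> cb) {
        Guard lk(*this);
        _continuations.push_back(std::move(cb));
    }

    // Public entry point that takes the mutex. Returns false when the calling
    // thread already holds it; a plain mutex would block forever at that point.
    bool shutdown() {
        if (_owner.load() == std::this_thread::get_id()) {
            return false;
        }
        Guard lk(*this);
        _shutDown = true;
        return true;
    }

    // Problematic shape: continuations run inline while the mutex is held.
    void finishHoldingLock() {
        Guard lk(*this);
        auto toRun = takeContinuations();
        for (auto& cb : toRun) cb();
    }

    // Safer shape: detach the continuations under the lock, run them after.
    void finishAfterUnlock() {
        std::vector<std::function<void()>> toRun;
        {
            Guard lk(*this);
            toRun = takeContinuations();
        }
        for (auto& cb : toRun) cb();
    }
};

// Returns what shutdown() reported when called from inside the continuation.
bool shutdownFromContinuation(bool holdLock) {
    Component c;
    bool result = false;
    c.onFinished([&] { result = c.shutdown(); });
    if (holdLock) {
        c.finishHoldingLock();
    } else {
        c.finishAfterUnlock();
    }
    return result;
}
```

With this sketch, `shutdownFromContinuation(true)` reports the re-entrant lock attempt and `shutdownFromContinuation(false)` shuts down normally. Scheduling the continuation on an executor rather than running it inline would avoid the re-entrancy in the same way.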
I think this was caused by SERVER-54735, which was reverted after the base of the patch build. Essentially, this is the same problem as SERVER-55205, and I think it is fixed now. cheahuychou.mao, have you seen this recently?
jason.chan, can you confirm that after the revert of SERVER-54735, we no longer call the until-block of the AsyncTry inline?
|