Rollback can stall on app eviction


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Storage Engines, Storage Engines - Server Integration
    • SESI - 2025-06-24, SESI - 2025-07-22
    • 3
    • TBD

      When we increase the size and iterations in jstests/noPassthrough/txns_cache_errors/transaction_aborted_under_cache_pressure.js, a rollback can get co-opted into application-thread eviction and freeze the server.

      This appears to occur on the regular eviction path and was called out in WT-13975. The title of SERVER-98113 implies that this case should be handled, but the body of that ticket indicates that it only targeted rollback for prepared transactions. We should determine whether this is a path on which we want to opt out of eviction.

      Possible resolutions:

      • Allow us to skip eviction on rollback
      • Tweak the parameters of the idle transaction killer so that it kills transactions in these situations
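
      To make the first option concrete, here is a toy model of the suspected stall. It is only a sketch and does not use any WiredTiger or server API: ToyCache, rollback, abort_all and skip_eviction are hypothetical names. It assumes that the eviction check runs after the rolling-back transaction has released its own updates, and that the remaining cache pressure comes from other open transactions that only the same (blocked) client would abort. Neither assumption is confirmed by this ticket.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCache:
    """Hypothetical stand-in for a cache; not a WiredTiger structure."""
    capacity: int
    # txn id -> bytes pinned by that transaction's uncommitted updates
    pinned: dict = field(default_factory=dict)

    def over_limit(self) -> bool:
        return sum(self.pinned.values()) > self.capacity


def rollback(cache: ToyCache, txn_id: int, skip_eviction: bool, max_waits: int = 5) -> bool:
    """Roll back one transaction.

    Returns True if the call returns to the client, False if it is still
    waiting on cache pressure after max_waits polls (the modelled stall).
    """
    # Assumption: the transaction's own updates are released first.
    cache.pinned.pop(txn_id, None)
    if skip_eviction:
        return True
    waits = 0
    while cache.over_limit():
        # Nothing else runs in this single-client model, so the pressure
        # never eases. A real thread would keep waiting; the model gives up.
        if waits == max_waits:
            return False
        waits += 1
    return True


def abort_all(cache: ToyCache, skip_eviction: bool) -> list:
    """A single client aborts its open transactions one at a time."""
    aborted = []
    for txn_id in list(cache.pinned):
        if not rollback(cache, txn_id, skip_eviction):
            break  # client is blocked on this call; later aborts are never sent
        aborted.append(txn_id)
    return aborted


if __name__ == "__main__":
    for skip in (False, True):
        cache = ToyCache(capacity=100, pinned={1: 80, 2: 80, 3: 80})
        print(skip, abort_all(cache, skip_eviction=skip), cache.over_limit())
```

      In this model, with skip_eviction=False the first abort never returns and the cache stays over its limit. With skip_eviction=True all three aborts complete and the cache drains. The second option (the idle transaction killer) would correspond to something outside the client removing entries from pinned while the rollback waits.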

       

      It's not clear to me exactly why rollback is being called; it seems related to the handling of multiple transactions on a single connection.

       

      Stack of the connection stuck in eviction:

      #0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0xf5d205f42298, op=137, expected=0, futex_word=0x51b53fb53654) at ./nptl/futex-internal.c:57
      #1  __futex_abstimed_wait_common (cancel=true, private=0, abstime=0xf5d205f42298, clockid=681549272, expected=0, futex_word=0x51b53fb53654) at ./nptl/futex-internal.c:87
      #2  __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x51b53fb53654, expected=expected@entry=0, clockid=clockid@entry=1, abstime=abstime@entry=0xf5d205f42298, private=private@entry=0) at ./nptl/futex-internal.c:139
      #3  0x0000f5d2289fcc10 in __pthread_cond_wait_common (abstime=0xf5d205f42298, clockid=1, mutex=0x51b53fb535f8, cond=0x51b53fb53628) at ./nptl/pthread_cond_wait.c:503
      #4  ___pthread_cond_timedwait64 (cond=0x51b53fb53628, mutex=0x51b53fb535f8, abstime=0xf5d205f42298) at ./nptl/pthread_cond_wait.c:652
      #5  0x0000b9579417b2f0 in __wt_cond_wait_signal (session=0x51b53c01c920, cond=0x51b53fb535f0, usecs=10000, run_func=0x0, signalled=0xf5d205f426f7) at src/third_party/wiredtiger/src/os_posix/os_mtx_cond.c:115
      #6  0x0000b95793fe559c in __wt_cond_wait (session=0x51b53c01c920, cond=0x51b53fb535f0, usecs=10000, run_func=0x0) at src/third_party/wiredtiger/src/include/misc_inline.h:21
      #7  0x0000b95793fe3fe8 in __wti_evict_app_assist_worker (session=0x51b53c01c920, busy=false, readonly=false, interruptible=true) at src/third_party/wiredtiger/src/evict/evict_lru.c:2950
      #8  0x0000b95794156488 in __wt_evict_app_assist_worker_check (session=0x51b53c01c920, busy=false, readonly=false, interruptible=true, didworkp=0x0) at src/third_party/wiredtiger/src/include/../evict/evict_inline.h:729
      #9  0x0000b95794156e98 in __wt_txn_rollback (session=0x51b53c01c920, cfg=0xf5d205f44de0) at src/third_party/wiredtiger/src/txn/txn.c:2275
      #10 0x0000b957940d4348 in __session_rollback_transaction (wt_session=0x51b53c01c920, config=0x0) at src/third_party/wiredtiger/src/session/session_api.c:1963
      #11 0x0000b95793d11bcc in mongo::WiredTigerSession::rollback_transaction<decltype(nullptr)>(decltype(nullptr)&&) (this=0x51b53429af80, args=<error reading variable: Attempt to dereference a generic pointer.>) at src/mongo/db/storage/wiredtiger/wiredtiger_session.h:146
      #12 0x0000b95793daf464 in mongo::WiredTigerRecoveryUnit::_txnClose (this=0x51b5293fa000, commit=false) at src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp:374
      #13 0x0000b95793dae648 in mongo::WiredTigerRecoveryUnit::_abort (this=0x51b5293fa000) at src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp:152
      #14 0x0000b95793db01bc in mongo::WiredTigerRecoveryUnit::doAbortUnitOfWork (this=0x51b5293fa000) at src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp:204
      #15 0x0000b9579aa4b9a4 in mongo::RecoveryUnit::abortUnitOfWork (this=0x51b5293fa000) at src/mongo/db/storage/recovery_unit.cpp:149
      #16 0x0000b9579aa49c30 in mongo::WriteUnitOfWork::~WriteUnitOfWork (this=0x51b534294780) at src/mongo/db/storage/write_unit_of_work.cpp:76
      #17 0x0000b95794fd653c in std::default_delete<mongo::WriteUnitOfWork>::operator() (this=0x51b5353f77c0, __ptr=0x51b534294780) at external/mongo_toolchain_v5/stow/gcc-v5/include/c++/14.2.0/bits/unique_ptr.h:93
      #18 0x0000b95796a091f4 in std::__uniq_ptr_impl<mongo::WriteUnitOfWork, std::default_delete<mongo::WriteUnitOfWork> >::reset (this=0x51b5353f77c0, __p=0x0) at external/mongo_toolchain_v5/stow/gcc-v5/include/c++/14.2.0/bits/unique_ptr.h:205
      #19 0x0000b95796a09174 in std::__uniq_ptr_impl<mongo::WriteUnitOfWork, std::default_delete<mongo::WriteUnitOfWork> >::operator= (this=0x51b5353f77c0, __u=...) at external/mongo_toolchain_v5/stow/gcc-v5/include/c++/14.2.0/bits/unique_ptr.h:185
      #20 0x0000b95796a09134 in std::__uniq_ptr_data<mongo::WriteUnitOfWork, std::default_delete<mongo::WriteUnitOfWork>, true, true>::operator= (this=0x51b5353f77c0) at external/mongo_toolchain_v5/stow/gcc-v5/include/c++/14.2.0/bits/unique_ptr.h:237
      #21 0x0000b95796a09100 in std::unique_ptr<mongo::WriteUnitOfWork, std::default_delete<mongo::WriteUnitOfWork> >::operator= (this=0x51b5353f77c0) at external/mongo_toolchain_v5/stow/gcc-v5/include/c++/14.2.0/bits/unique_ptr.h:408
      #22 0x0000b95796a08db0 in mongo::OperationContext::setWriteUnitOfWork_DO_NOT_USE (this=0x51b5353f7680, writeUnitOfWork=...) at src/mongo/db/operation_context.h:348
      #23 0x0000b95796a06f60 in mongo::shard_role_details::setWriteUnitOfWork (opCtx=0x51b5353f7680, writeUnitOfWork=...) at src/mongo/db/transaction_resources.cpp:136
      #24 0x0000b95794fb3288 in mongo::TransactionParticipant::Participant::_cleanUpTxnResourceOnOpCtx (this=0xf5d205f47470, opCtx=0x51b5353f7680, terminationCause=mongo::TerminationCause::kAborted, isSplitPreparedTxn=false) at src/mongo/db/transaction/transaction_participant.cpp:2643
      #25 0x0000b95794fb41f4 in mongo::TransactionParticipant::Participant::_abortActiveTransaction (this=0xf5d205f47470, opCtx=0x51b5353f7680, expectedStates=2) at src/mongo/db/transaction/transaction_participant.cpp:2496
      #26 0x0000b95794fb3944 in mongo::TransactionParticipant::Participant::abortTransaction (this=0xf5d205f47470, opCtx=0x51b5353f7680, status=Status::OK()) at src/mongo/db/transaction/transaction_participant.cpp:2396
      #27 0x0000b95793b6c25c in mongo::(anonymous namespace)::CheckoutSessionAndInvokeCommand::_cleanupTransaction (this=0xf5d205f474e0, txnParticipant=...) at src/mongo/db/service_entry_point_shard_role.cpp:898
      #28 0x0000b95793b6a048 in mongo::(anonymous namespace)::CheckoutSessionAndInvokeCommand::~CheckoutSessionAndInvokeCommand (this=0xf5d205f474e0) at src/mongo/db/service_entry_point_shard_role.cpp:796
      #29 0x0000b95793b69ddc in mongo::(anonymous namespace)::RunCommandImpl::_runCommand (this=0xf5d205f47a10) at src/mongo/db/service_entry_point_shard_role.cpp:1283
      #30 0x0000b95793b6d2d8 in mongo::(anonymous namespace)::RunCommandAndWaitForWriteConcern::_runCommandWithFailPoint (this=0xf5d205f47a10) at src/mongo/db/service_entry_point_shard_role.cpp:1405
      #31 0x0000b95793b6c90c in mongo::(anonymous namespace)::RunCommandAndWaitForWriteConcern::_runImpl()::$_0::operator()() const (this=0xf5d205f47758) at src/mongo/db/service_entry_point_shard_role.cpp:1321
      #32 0x0000b95793b69ba4 in mongo::(anonymous namespace)::RunCommandAndWaitForWriteConcern::_runImpl (this=0xf5d205f47a10) at src/mongo/db/service_entry_point_shard_role.cpp:1319
      #33 0x0000b95793b6e608 in mongo::(anonymous namespace)::RunCommandImpl::run()::{lambda()#1}::operator()() const (this=0xf5d205f47878) at src/mongo/db/service_entry_point_shard_role.cpp:699
      #34 0x0000b95793b685c0 in mongo::(anonymous namespace)::RunCommandImpl::run (this=0xf5d205f47a10) at src/mongo/db/service_entry_point_shard_role.cpp:696
      #35 0x0000b95793b66770 in mongo::(anonymous namespace)::ExecCommandDatabase::_commandExec (this=0xf5d205f48010) at src/mongo/db/service_entry_point_shard_role.cpp:1907
      #36 0x0000b95793b63b64 in mongo::(anonymous namespace)::ExecCommandDatabase::run()::{lambda()#1}::operator()() const (this=0xf5d205f47bc8) at src/mongo/db/service_entry_point_shard_role.cpp:505
      #37 0x0000b95793b630a8 in mongo::(anonymous namespace)::ExecCommandDatabase::run (this=0xf5d205f48010) at src/mongo/db/service_entry_point_shard_role.cpp:502
      #38 0x0000b95793b622f8 in mongo::(anonymous namespace)::executeCommand (execContext=...) at src/mongo/db/service_entry_point_shard_role.cpp:2231
      #39 0x0000b95793b614f4 in mongo::(anonymous namespace)::receivedCommands (execContext=...) at src/mongo/db/service_entry_point_shard_role.cpp:2302
      #40 0x0000b95793b6093c in mongo::(anonymous namespace)::HandleRequest::runOperation (this=0xf5d205f486a8) at src/mongo/db/service_entry_point_shard_role.cpp:2361
      #41 0x0000b95793b60270 in mongo::ServiceEntryPointShardRole::handleRequest (this=0x51b53fe2b618, opCtx=0x51b5353f7680, m=...) at src/mongo/db/service_entry_point_shard_role.cpp:2465
      #42 0x0000b957999ccb2c in mongo::transport::SessionWorkflow::Impl::_dispatchWork (this=0x51b53fbadda0) at src/mongo/transport/session_workflow.cpp:723
      #43 0x0000b957999ce980 in mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0::operator()<std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> > >(std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> >) const (this=0xf5d205f48c40, work=...) at src/mongo/transport/session_workflow.cpp:786
      #44 0x0000b957999ce874 in mongo::future_details::call<mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&, std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> > >(mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&, std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> >&&) (func=..., arg=...) at src/mongo/util/future_impl.h:253
      #45 0x0000b957999ce814 in mongo::future_details::throwingCall<mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&, std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> > >(mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&, std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> >&&) (func=..., args=...) at src/mongo/util/future_impl.h:311
      #46 0x0000b957999ce55c in mongo::future_details::FutureImpl<std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> > >::then<mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0>(mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&&) &&::{lambda(std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> >&&)#1}::operator()(std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> >&&) const (this=0xf5d205f48b60, val=...) at src/mongo/util/future_impl.h:967
      #47 0x0000b957999ce33c in mongo::future_details::FutureImpl<std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> > >::generalImpl<mongo::future_details::FutureImpl<std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> > >::then<mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0>(mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&&) &&::{lambda(std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> >&&)#1}, mongo::future_details::FutureImpl<std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> > >::then<mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0>(mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&&) &&::{lambda(mongo::Status&&)#1}, mongo::future_details::FutureImpl<std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> > >::then<mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0>(mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&&) &&::{lambda()#1}>(mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&&, mongo::future_details::FutureImpl<std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> > >::then<mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0>(mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&&) &&::{lambda(mongo::Status&&)#1}&&, mongo::future_details::FutureImpl<std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> > 
>::then<mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0>(mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&&) &&::{lambda()#1}&&) (this=0xf5d205f48c48, success=..., fail=..., notReady=...)
          at src/mongo/util/future_impl.h:1245
      #48 0x0000b957999ce274 in mongo::future_details::FutureImpl<std::unique_ptr<mongo::transport::SessionWorkflow::Impl::WorkItem, std::default_delete<mongo::transport::SessionWorkflow::Impl::WorkItem> > >::then<mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0>(mongo::transport::SessionWorkflow::Impl::_doOneIteration()::$_0&&) && (
          this=0xf5d205f48c48, func=...) at src/mongo/util/future_impl.h:963
      #49 0x0000b957999cd4e4 in _ZNO5mongo6FutureISt10unique_ptrINS_9transport15SessionWorkflow4Impl8WorkItemESt14default_deleteIS5_EEE4thenIZNS4_15_doOneIterationEvE3$_0Qsr14future_detailsE10isCallableITL0__T_EEEDaOSD_ (this=0xf5d205f48c48, func=...) at src/mongo/util/future.h:417
      #50 0x0000b957999cd450 in mongo::transport::SessionWorkflow::Impl::_doOneIteration (this=0x51b53fbadda0) at src/mongo/transport/session_workflow.cpp:782
      #51 0x0000b957999d2414 in mongo::transport::SessionWorkflow::Impl::_scheduleIteration()::$_0::operator()(mongo::Status) const (this=0x51b534267a38, status=Status::OK()) at src/mongo/transport/session_workflow.cpp:827
      #52 0x0000b957999d222c in mongo::unique_function<void (mongo::Status)>::makeImpl<mongo::transport::SessionWorkflow::Impl::_scheduleIteration()::$_0>(mongo::transport::SessionWorkflow::Impl::_scheduleIteration()::$_0&&)::SpecificImpl::call(mongo::Status&&) (this=0x51b534267a30, args=Status::OK()) at src/mongo/util/functional.h:262
      #53 0x0000b957911762a4 in mongo::unique_function<void (mongo::Status)>::operator()(mongo::Status) const (this=0x51b53faae5e0, args=Status::OK()) at src/mongo/util/functional.h:220
      #54 0x0000b957999d7ef0 in mongo::transport::SessionWorkflow::Impl::_captureContext(mongo::unique_function<void (mongo::Status)>)::{lambda(mongo::Status)#1}::operator()(mongo::Status)::{lambda()#1}::operator()() const (this=0xf5d205f48e88) at src/mongo/transport/session_workflow.cpp:483
      #55 0x0000b957999d7e50 in mongo::ClientStrand::run<mongo::transport::SessionWorkflow::Impl::_captureContext(mongo::unique_function<void (mongo::Status)>)::{lambda(mongo::Status)#1}::operator()(mongo::Status)::{lambda()#1}>(mongo::transport::SessionWorkflow::Impl::_captureContext(mongo::unique_function<void (mongo::Status)>)::{lambda(mongo::Status)#1}::operator()(mongo::Status)::{lambda()#1}) (this=0x51b53fbb6400, task=...) at src/mongo/db/client_strand.h:177
      #56 0x0000b957999d7e00 in mongo::transport::SessionWorkflow::Impl::_captureContext(mongo::unique_function<void (mongo::Status)>)::{lambda(mongo::Status)#1}::operator()(mongo::Status) (this=0x51b53faae5c8, st=Status::OK()) at src/mongo/transport/session_workflow.cpp:483
      #57 0x0000b957999d7cd8 in mongo::unique_function<void (mongo::Status)>::makeImpl<mongo::transport::SessionWorkflow::Impl::_captureContext(mongo::unique_function<void (mongo::Status)>)::{lambda(mongo::Status)#1}>(mongo::transport::SessionWorkflow::Impl::_captureContext(mongo::unique_function<void (mongo::Status)>)::{lambda(mongo::Status)#1}&&)::SpecificImpl::call(mongo::Status&&) (this=0x51b53faae5c0, args=Status::OK()) at src/mongo/util/functional.h:262
      #58 0x0000b957911762a4 in mongo::unique_function<void (mongo::Status)>::operator()(mongo::Status) const (this=0x51b5345f5d20, args=Status::OK()) at src/mongo/util/functional.h:220
      #59 0x0000b95799af691c in mongo::transport::service_executor_synchronous_detail::ServiceExecutorSyncImpl::SharedState::WorkerThreadInfo::run (this=0x51b53fbade00) at src/mongo/transport/service_executor_synchronous.cpp:116
      #60 0x0000b95799af3668 in mongo::transport::service_executor_synchronous_detail::ServiceExecutorSyncImpl::SharedState::schedule(mongo::unique_function<void (mongo::Status)>, mongo::StringData)::$_0::operator()() const (this=0x51b53fbb1ae8) at src/mongo/transport/service_executor_synchronous.cpp:152
      #61 0x0000b95799af471c in mongo::unique_function<void ()>::makeImpl<mongo::transport::service_executor_synchronous_detail::ServiceExecutorSyncImpl::SharedState::schedule(mongo::unique_function<void (mongo::Status)>, mongo::StringData)::$_0>(mongo::transport::service_executor_synchronous_detail::ServiceExecutorSyncImpl::SharedState::schedule(mongo::unique_function<void (mongo::Status)>, mongo::StringData)::$_0&&)::SpecificImpl::call() (this=0x51b53fbb1ae0) at src/mongo/util/functional.h:262
      #62 0x0000b95791b53748 in mongo::unique_function<void ()>::operator()() const (this=0x51b53fbbdf78) at src/mongo/util/functional.h:220
      #63 0x0000b95799af9d5c in mongo::transport::launchServiceWorkerThread(mongo::unique_function<void ()>)::$_0::operator()() (this=0x51b53fbbdf68) at src/mongo/transport/service_executor_utils.cpp:119
      #64 0x0000b95799af9d10 in mongo::unique_function<void ()>::makeImpl<mongo::transport::launchServiceWorkerThread(mongo::unique_function<void ()>)::$_0>(mongo::transport::launchServiceWorkerThread(mongo::unique_function<void ()>)::$_0&&)::SpecificImpl::call() (this=0x51b53fbbdf60) at src/mongo/util/functional.h:262
      #65 0x0000b95791b53748 in mongo::unique_function<void ()>::operator()() const (this=0x51b53fe2ada0) at src/mongo/util/functional.h:220
      #66 0x0000b95799af999c in mongo::transport::(anonymous namespace)::runFunc (ctx=0x51b53fe2ada0) at src/mongo/transport/service_executor_utils.cpp:63
      #67 0x0000f5d2289fd5c8 in start_thread (arg=0x0) at ./nptl/pthread_create.c:442
      #68 0x0000f5d228a65edc in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79 

            Assignee:
            Aaron Balsara
            Reporter:
            Aaron Balsara
            Votes:
            0
            Watchers:
            5