WiredTiger / WT-7982

Fix heap-use-after-free bug when sweeping cursor cache (4.2 MongoDB, ASAN)

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Storage - Ra 2021-09-06

      A build failure (BF) detected an AddressSanitizer heap-use-after-free error in a MongoDB 4.2 build, triggered in the cursor cache sweeping logic.

      The heap-use-after-free access stack trace:

       ==120483==ERROR: AddressSanitizer: heap-use-after-free on address 0x6140003c50d0 at pc 0x55adb3623fe7 bp 0x7f9b03d2ad00 sp 0x7f9b03d2acf8
          #0 0x55adb3623fe6 in __wt_cursor_dhandle_decr_use /data/mci/40ae47ae894f40074930da026dfd8cc7/src/src/third_party/wiredtiger/src/include/cursor.i:311:73
          #1 0x55adb3620f0b in __curfile_close /data/mci/40ae47ae894f40074930da026dfd8cc7/src/src/third_party/wiredtiger/src/cursor/cur_file.c:522:9
          #2 0x55adb376d775 in __wt_session_cursor_cache_sweep /data/mci/40ae47ae894f40074930da026dfd8cc7/src/src/third_party/wiredtiger/src/session/session_api.c:96:17
          #3 0x55adb378c0aa in __session_reset /data/mci/40ae47ae894f40074930da026dfd8cc7/src/src/third_party/wiredtiger/src/session/session_api.c:974:9
          #4 0x55adb3570ef9 in mongo::WiredTigerSessionCache::releaseSession(mongo::WiredTigerSession*) /data/mci/42c8913417f9064a51396a1aded9cde0/src/src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp:449:9
          #5 0x55adb354868e in std::unique_ptr<mongo::WiredTigerSession, mongo::WiredTigerSessionCache::WiredTigerSessionDeleter>::~unique_ptr() /opt/mongodbtoolchain/revisions/0d5a071f1db663c050a1d7f330c13f46e62d6d4f/stow/gcc-v3.ulc/lib/gcc/x86_64-mongodb-linux/8.5.0/../../../../include/c++/8.5.0/bits/unique_ptr.h:277:4
          #6 0x55adb354868e in mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit() /data/mci/40ae47ae894f40074930da026dfd8cc7/src/src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp:195
          #7 0x55adb3549b7f in mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit() /data/mci/40ae47ae894f40074930da026dfd8cc7/src/src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp:192:51
          #8 0x55adb3532b6d in std::default_delete<mongo::RecoveryUnit>::operator()(mongo::RecoveryUnit*) const /opt/mongodbtoolchain/revisions/0d5a071f1db663c050a1d7f330c13f46e62d6d4f/stow/gcc-v3.ulc/lib/gcc/x86_64-mongodb-linux/8.5.0/../../../../include/c++/8.5.0/bits/unique_ptr.h:81:2
          #9 0x55adb3532b6d in std::unique_ptr<mongo::RecoveryUnit, std::default_delete<mongo::RecoveryUnit> >::~unique_ptr() /opt/mongodbtoolchain/revisions/0d5a071f1db663c050a1d7f330c13f46e62d6d4f/stow/gcc-v3.ulc/lib/gcc/x86_64-mongodb-linux/8.5.0/../../../../include/c++/8.5.0/bits/unique_ptr.h:277
          #10 0x55adb71427cd in mongo::OperationContext::~OperationContext() /data/mci/40ae47ae894f40074930da026dfd8cc7/src/src/mongo/db/operation_context.cpp:92:37
          #11 0x55adb714289f in mongo::OperationContext::~OperationContext() /data/mci/40ae47ae894f40074930da026dfd8cc7/src/src/mongo/db/operation_context.cpp:92:37
          #12 0x55adb715aaed in mongo::ServiceContext::OperationContextDeleter::operator()(mongo::OperationContext*) const /data/mci/40ae47ae894f40074930da026dfd8cc7/src/src/mongo/db/service_context.cpp:301:5
          #13 0x55adb312a231 in std::unique_ptr<mongo::OperationContext, mongo::ServiceContext::OperationContextDeleter>::~unique_ptr() /opt/mongodbtoolchain/revisions/0d5a071f1db663c050a1d7f330c13f46e62d6d4f/stow/gcc-v3.ulc/lib/gcc/x86_64-mongodb-linux/8.5.0/../../../../include/c++/8.5.0/bits/unique_ptr.h:277:4
          #14 0x55adb312a231 in mongo::repl::SyncTail::OpQueueBatcher::run() /data/mci/42c8913417f9064a51396a1aded9cde0/src/src/mongo/db/repl/sync_tail.cpp:670
          #15 0x55adb31299b8 in mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'()::operator()() const /data/mci/42c8913417f9064a51396a1aded9cde0/src/src/mongo/db/repl/sync_tail.cpp:613:28
          #16 0x55adb31299b8 in void std::__invoke_impl<void, mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'()>(std::__invoke_other, mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'()&&) /opt/mongodbtoolchain/revisions/0d5a071f1db663c050a1d7f330c13f46e62d6d4f/stow/gcc-v3.ulc/lib/gcc/x86_64-mongodb-linux/8.5.0/../../../../include/c++/8.5.0/bits/invoke.h:60
          #17 0x55adb31299b8 in std::__invoke_result<mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'()>::type std::__invoke<mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'()>(mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'()&&) /opt/mongodbtoolchain/revisions/0d5a071f1db663c050a1d7f330c13f46e62d6d4f/stow/gcc-v3.ulc/lib/gcc/x86_64-mongodb-linux/8.5.0/../../../../include/c++/8.5.0/bits/invoke.h:95
          #18 0x55adb31299b8 in decltype(auto) std::__apply_impl<mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'(), std::tuple<> >(mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'()&&, std::tuple<>&&, std::integer_sequence<unsigned long>) /opt/mongodbtoolchain/revisions/0d5a071f1db663c050a1d7f330c13f46e62d6d4f/stow/gcc-v3.ulc/lib/gcc/x86_64-mongodb-linux/8.5.0/../../../../include/c++/8.5.0/tuple:1678
          #19 0x55adb31299b8 in decltype(auto) std::apply<mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'(), std::tuple<> >(mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'()&&, std::tuple<>&&) /opt/mongodbtoolchain/revisions/0d5a071f1db663c050a1d7f330c13f46e62d6d4f/stow/gcc-v3.ulc/lib/gcc/x86_64-mongodb-linux/8.5.0/../../../../include/c++/8.5.0/tuple:1687
          #20 0x55adb31299b8 in mongo::stdx::thread::thread<mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'(), 0>(mongo::repl::SyncTail::OpQueueBatcher::OpQueueBatcher(mongo::repl::SyncTail*, mongo::repl::StorageInterface*, mongo::repl::OplogBuffer*, std::function<mongo::StatusWith<std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> > > (mongo::OperationContext*, mongo::repl::OplogApplier::BatchLimits const&)>)::'lambda'()&&)::'lambda'()::operator()() /data/mci/42c8913417f9064a51396a1aded9cde0/src/src/mongo/stdx/thread.h:173
          #21 0x55adb7507cae in execute_native_thread_routine /data/mci/865082d961f6c894b0989cc920317fcd/toolchain-builder/tmp/build-gcc-v3.sh-56z/build/x86_64-mongodb-linux/libstdc++-v3/src/c++11/../../../../../src/combined/libstdc++-v3/src/c++11/thread.cc:80:18
          #22 0x7f9b2c7b96da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
      
      ...
      
      SUMMARY: AddressSanitizer: heap-use-after-free /data/mci/40ae47ae894f40074930da026dfd8cc7/src/src/third_party/wiredtiger/src/include/cursor.i:311:73 in __wt_cursor_dhandle_decr_use
      Shadow bytes around the buggy address:
        0x0c28800709c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        0x0c28800709d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        0x0c28800709e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        0x0c28800709f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        0x0c2880070a00: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
      =>0x0c2880070a10: fd fd fd fd fd fd fd fd fd fd[fd]fd fd fd fd fd
        0x0c2880070a20: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c2880070a30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c2880070a40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        0x0c2880070a50: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        0x0c2880070a60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      Shadow byte legend (one shadow byte represents 8 application bytes):
        Addressable:           00
        Partially addressable: 01 02 03 04 05 06 07
        Heap left redzone:       fa
        Freed heap region:       fd
        Stack left redzone:      f1
        Stack mid redzone:       f2
        Stack right redzone:     f3
        Stack after return:      f5
        Stack use after scope:   f8
        Global redzone:          f9
        Global init order:       f6
        Poisoned by user:        f7
        Container overflow:      fc
        Array cookie:            ac
        Intra object redzone:    bb
        ASan internal:           fe
        Left alloca redzone:     ca
        Right alloca redzone:    cb
        Shadow gap:              cc
      ==120483==ABORTING
      

      The thread that originally freed the underlying data handle memory (the sweep server):

      freed by thread T11 here:
          #0 0x55adb2b6b3f2 in free /data/mci/dfb39050aa31abca2c40371c920c586e/toolchain-builder/tmp/build-llvm.sh-GjU/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:124:3
          #1 0x55adb35dd73a in __conn_dhandle_destroy /data/mci/42c8913417f9064a51396a1aded9cde0/src/src/third_party/wiredtiger/src/conn/conn_dhandle.c:152:5
          #2 0x55adb35e3bcc in __wt_conn_dhandle_discard_single /data/mci/42c8913417f9064a51396a1aded9cde0/src/src/third_party/wiredtiger/src/conn/conn_dhandle.c:798:9
          #3 0x55adb35fa106 in __sweep_remove_one /data/mci/42c8913417f9064a51396a1aded9cde0/src/src/third_party/wiredtiger/src/conn/conn_sweep.c:199:5
          #4 0x55adb35f752c in __sweep_remove_handles /data/mci/42c8913417f9064a51396a1aded9cde0/src/src/third_party/wiredtiger/src/conn/conn_sweep.c:237:13
          #5 0x55adb35f752c in __sweep_server /data/mci/42c8913417f9064a51396a1aded9cde0/src/src/third_party/wiredtiger/src/conn/conn_sweep.c:352
          #6 0x7f9b2c7b96da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)

      The heap-use-after-free fault occurs during a session reset, when the following cache sweep condition in __session_reset is met:

      static int
      __session_reset(WT_SESSION *wt_session)
      {
      ...
          if (--session->cursor_sweep_countdown == 0) {
              session->cursor_sweep_countdown = WT_SESSION_CURSOR_SWEEP_COUNTDOWN;
              WT_TRET(__wt_session_cursor_cache_sweep(session));
          }
      ...
      
      
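      For context, this countdown means the cursor cache sweep runs inline on whichever application thread resets the session, once every WT_SESSION_CURSOR_SWEEP_COUNTDOWN resets. Below is a minimal standalone sketch of that gate; the constant's value and the types are hypothetical stand-ins, not WiredTiger source:

      #include <stdio.h>

      #define WT_SESSION_CURSOR_SWEEP_COUNTDOWN 40 /* hypothetical value */

      struct session {
          int cursor_sweep_countdown;
      };

      /* Mirrors the gate above: only every Nth reset triggers a sweep. */
      static void
      session_reset(struct session *s)
      {
          if (--s->cursor_sweep_countdown == 0) {
              s->cursor_sweep_countdown = WT_SESSION_CURSOR_SWEEP_COUNTDOWN;
              printf("sweep cursor cache\n"); /* __wt_session_cursor_cache_sweep */
          }
      }

      int
      main(void)
      {
          struct session s = {WT_SESSION_CURSOR_SWEEP_COUNTDOWN};

          for (int i = 0; i < 200; ++i)
              session_reset(&s);
          return 0;
      }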

      On initial inspection, this looks like a race condition between the sweep server and __session_reset: the __wt_session_cursor_cache_sweep function (called from __session_reset) attempts to dereference a cached cursor's data handle that was concurrently freed by the sweep server.
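      To make the suspected interleaving concrete, here is a minimal standalone C program (an analogy with hypothetical stand-in types and names, not WiredTiger source) reproducing the same shape of race: one thread frees a shared data handle while another decrements a use count through a cached cursor's stale pointer. Built with gcc -fsanitize=address -pthread, ASAN flags the decrement as a heap-use-after-free, matching the report above.

      #include <pthread.h>
      #include <stdlib.h>
      #include <unistd.h>

      /* Stand-in for the data handle the cached cursor points at. */
      struct dhandle {
          int session_cursor_use; /* the counter the cursor close path decrements */
      };

      /* Stand-in for a cached cursor holding a raw pointer to the handle. */
      struct cursor {
          struct dhandle *dhandle;
      };

      static struct dhandle *shared_handle;

      /* Sweep-server analog: concludes the handle is unused and destroys it
       * (the role __sweep_remove_one/__conn_dhandle_destroy play above). */
      static void *
      sweep_server(void *unused)
      {
          (void)unused;
          free(shared_handle);
          return NULL;
      }

      /* Session-reset analog: sweeps its cursor cache, closing a cached
       * cursor (the role __wt_session_cursor_cache_sweep plays above). */
      static void *
      session_reset(void *arg)
      {
          struct cursor *c = arg;

          usleep(1000); /* let the sweep server win the race */
          --c->dhandle->session_cursor_use; /* heap-use-after-free here */
          return NULL;
      }

      int
      main(void)
      {
          pthread_t sweeper, resetter;
          struct cursor cached;

          shared_handle = malloc(sizeof(*shared_handle));
          shared_handle->session_cursor_use = 1;
          cached.dhandle = shared_handle;

          pthread_create(&sweeper, NULL, sweep_server, NULL);
          pthread_create(&resetter, NULL, session_reset, &cached);
          pthread_join(sweeper, NULL);
          pthread_join(resetter, NULL);
          return 0;
      }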

       

      Additional Context:

      This occurred on a MongoDB 4.2 build at commit 7853d994. The most recent WiredTiger drop commits surrounding this revision don't appear to make any significant modifications to the sweeping logic, suggesting the bug has been present for some time and likely requires very specific timing to trigger.

       

            Assignee:
            Backlog - Storage Engines Team (backlog-server-storage-engines)
            Reporter:
            Alison Felizzi (alison.felizzi@mongodb.com)
            Votes:
            0
            Watchers:
            6
