Core Server / SERVER-41010

getMinValid() in bgsync shouldn't conflict with PBWM lock

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Backlog
    • Component/s: Replication
    • Labels: None
    • Operating System: ALL
    • Sprint: Repl 2019-06-17, Repl 2019-07-01
    • Linked BF Score: 33

      Description

      On a secondary, the rsBackgroundSync thread creates a new operation context to read minValid, and that read conflicts with the PBWM (ParallelBatchWriterMode) lock. In steady state, the applier is the only writer of minValid: it advances minValid after writing the oplog. Bgsync reads minValid when choosing a sync source, for the case where minValid is ahead of the last fetched optime. This read does not need to be synchronized to batch boundaries, so it should not have to wait on the PBWM lock. The stack trace below shows rsBackgroundSync blocked in getMinValid() while acquiring the global lock in MODE_IS:

      Thread 54: "rsBackgroundSync" (Thread 0x7e983ac4ee60 (LWP 16775))
      #0  0x00007e9857b82b94 in pthread_cond_timedwait@@GLIBC_2.17 () from /lib/powerpc64le-linux-gnu/libpthread.so.0
      #1  0x00000b5b64fa1e68 in __gthread_cond_timedwait (__abs_timeout=0x7e983ac4cb68, __mutex=<optimized out>, __cond=<optimized out>) at /opt/mongodbtoolchain/revisions/94dac13bc8c0b50beff286acac77adeb2e81761e/stow/gcc-v3.tmq/include/c++/8.2.0/ppc64le-mongodb-linux/bits/gthr-default.h:871
      #2  std::condition_variable::__wait_until_impl<std::chrono::duration<long, std::ratio<1l, 1000000000l> > > (__atime=..., __lock=<synthetic pointer>..., this=0xb5b909d72d0) at /opt/mongodbtoolchain/revisions/94dac13bc8c0b50beff286acac77adeb2e81761e/stow/gcc-v3.tmq/include/c++/8.2.0/condition_variable:178
      #3  std::condition_variable::wait_until<std::chrono::duration<long, std::ratio<1l, 1000000000l> > > (__atime=..., __lock=<synthetic pointer>..., this=0xb5b909d72d0) at /opt/mongodbtoolchain/revisions/94dac13bc8c0b50beff286acac77adeb2e81761e/stow/gcc-v3.tmq/include/c++/8.2.0/condition_variable:106
      #4  std::condition_variable::wait_until<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1, 1000000000> >, mongo::CondVarLockGrantNotification::wait(mongo::Milliseconds)::<lambda()> > (__p=..., __atime=..., __lock=<synthetic pointer>..., this=0xb5b909d72d0) at /opt/mongodbtoolchain/revisions/94dac13bc8c0b50beff286acac77adeb2e81761e/stow/gcc-v3.tmq/include/c++/8.2.0/condition_variable:129
      #5  std::condition_variable::wait_for<long int, std::ratio<1, 1000000000>, mongo::CondVarLockGrantNotification::wait(mongo::Milliseconds)::<lambda()> > (__p=..., __rtime=..., __lock=<synthetic pointer>..., this=0xb5b909d72d0) at /opt/mongodbtoolchain/revisions/94dac13bc8c0b50beff286acac77adeb2e81761e/stow/gcc-v3.tmq/include/c++/8.2.0/condition_variable:156
      #6  mongo::CondVarLockGrantNotification::wait (this=this@entry=0xb5b909d72a0, timeout=..., timeout@entry=...) at src/mongo/db/concurrency/lock_state.cpp:213
      #7  0x00000b5b64fa3eec in mongo::LockerImpl::lockComplete (this=0xb5b909d7200, opCtx=0x0, resId=..., mode=<optimized out>, deadline=...) at src/mongo/db/concurrency/lock_state.cpp:867
      #8  0x00000b5b64fa67a0 in mongo::LockerImpl::lock (this=<optimized out>, resId=..., mode=<optimized out>, deadline=...) at src/mongo/db/concurrency/lock_state.h:173
      #9  0x00000b5b64f9155c in mongo::Lock::ResourceLock::lock (this=0x7e983ac4d290, mode=<optimized out>) at src/mongo/db/concurrency/d_concurrency.cpp:343
      #10 0x00000b5b64f9172c in mongo::Lock::GlobalLock::_enqueue (this=this@entry=0x7e983ac4d280, lockMode=lockMode@entry=mongo::MODE_IS, deadline=...) at src/mongo/db/concurrency/d_concurrency.cpp:181
      #11 0x00000b5b64f918e8 in mongo::Lock::GlobalLock::GlobalLock (this=0x7e983ac4d280, opCtx=<optimized out>, lockMode=<optimized out>, deadline=..., behavior=<optimized out>, enqueueOnly=...) at src/mongo/db/concurrency/d_concurrency.cpp:158
      #12 0x00000b5b64f91974 in mongo::Lock::GlobalLock::GlobalLock (this=0x7e983ac4d280, opCtx=<optimized out>, lockMode=<optimized out>, deadline=..., behavior=<optimized out>) at src/mongo/db/concurrency/d_concurrency.cpp:140
      #13 0x00000b5b64f91a74 in mongo::Lock::DBLock::DBLock (this=0x7e983ac4d268, opCtx=0xb5b91332940, db="local", mode=<optimized out>, deadline=...) at src/mongo/db/concurrency/lock_manager_defs.h:101
      #14 0x00000b5b642664e4 in mongo::AutoGetDb::AutoGetDb (this=<optimized out>, opCtx=<optimized out>, dbName=..., mode=<optimized out>, deadline=...) at src/mongo/db/catalog_raii.cpp:55
      #15 0x00000b5b6426732c in mongo::AutoGetCollection::AutoGetCollection (this=0x7e983ac4d268, opCtx=0xb5b91332940, nsOrUUID=..., modeDB=<optimized out>, modeColl=<optimized out>, viewMode=<optimized out>, deadline=...) at src/mongo/base/string_data.h:61
      #16 0x00000b5b62d5f99c in mongo::AutoGetCollection::AutoGetCollection (deadline=..., viewMode=mongo::AutoGetCollection::kViewsForbidden, modeAll=<optimized out>, nsOrUUID=..., opCtx=<error reading variable: value has been optimized out>, this=<optimized out>) at src/mongo/db/catalog_raii.h:91
      #17 mongo::repl::(anonymous namespace)::<lambda()>::operator()(void) const (__closure=__closure@entry=0x7e983ac4d3c8) at src/mongo/db/repl/storage_interface_impl.cpp:606
      #18 0x00000b5b62d60b28 in mongo::writeConflictRetry<mongo::repl::(anonymous namespace)::_findOrDeleteDocuments(mongo::OperationContext*, const mongo::NamespaceStringOrUUID&, boost::optional<mongo::StringData>, mongo::repl::StorageInterface::ScanDirection, const mongo::BSONObj&, const mongo::BSONObj&, mongo::BoundInclusion, std::size_t, mongo::repl::(anonymous namespace)::FindDeleteMode)::<lambda()> > (f=..., ns=..., opStr=..., opCtx=0xb5b91332940) at /opt/mongodbtoolchain/revisions/94dac13bc8c0b50beff286acac77adeb2e81761e/stow/gcc-v3.tmq/include/c++/8.2.0/bits/atomic_base.h:390
      #19 mongo::repl::(anonymous namespace)::_findOrDeleteDocuments (opCtx=<optimized out>, opCtx@entry=0xb5b91332940, nsOrUUID=..., indexName=..., scanDirection=<optimized out>, scanDirection@entry=mongo::repl::StorageInterface::ScanDirection::kForward, startKey=unowned empty BSONObj @ 0xb5b65398a10 <mongo::BSONObj::BSONObj()::kEmptyObjectPrototype>, endKey=<error reading variable: Cannot access memory at address 0x0>..., boundInclusion=<optimized out>, limit=<optimized out>, limit@entry=0, mode=mode@entry=mongo::repl::(anonymous namespace)::FindDeleteMode::kFind) at src/mongo/db/repl/storage_interface_impl.cpp:712
      #20 0x00000b5b62d698cc in mongo::repl::StorageInterfaceImpl::findDocuments (this=<optimized out>, opCtx=0xb5b91332940, nss=..., indexName=..., scanDirection=<optimized out>, startKey=unowned empty BSONObj @ 0xb5b65398a10 <mongo::BSONObj::BSONObj()::kEmptyObjectPrototype>, boundInclusion=<optimized out>, limit=2) at src/mongo/bson/bsonobj.h:128
      #21 0x00000b5b62d5d828 in mongo::repl::StorageInterfaceImpl::findSingleton (this=<optimized out>, opCtx=<optimized out>, nss=...) at src/mongo/bson/bsonobj.h:128
      #22 0x00000b5b62da0380 in mongo::repl::ReplicationConsistencyMarkersImpl::_getMinValidDocument (this=<error reading variable: value has been optimized out>, opCtx=<optimized out>) at src/mongo/db/repl/replication_consistency_markers_impl.cpp:74
      #23 0x00000b5b62da09bc in mongo::repl::ReplicationConsistencyMarkersImpl::getMinValid (this=<optimized out>, opCtx=<optimized out>) at src/mongo/db/repl/replication_consistency_markers_impl.cpp:179
      #24 0x00000b5b62e3361c in mongo::repl::BackgroundSync::_produce (this=this@entry=0xb5b8b357e00) at /opt/mongodbtoolchain/revisions/94dac13bc8c0b50beff286acac77adeb2e81761e/stow/gcc-v3.tmq/include/c++/8.2.0/bits/unique_ptr.h:342
      #25 0x00000b5b62e35564 in mongo::repl::BackgroundSync::_runProducer (this=this@entry=0xb5b8b357e00) at src/mongo/db/repl/bgsync.cpp:213
      #26 0x00000b5b62e357ac in mongo::repl::BackgroundSync::_run (this=0xb5b8b357e00) at src/mongo/db/repl/bgsync.cpp:174
      #27 0x00000b5b65377694 in std::execute_native_thread_routine (__p=<optimized out>) at ../../../../../src/combined/libstdc++-v3/src/c++11/thread.cc:80
      #28 0x00007e9857b7885c in start_thread () from /lib/powerpc64le-linux-gnu/libpthread.so.0
      #29 0x00007e9857a99028 in clone () from /lib/powerpc64le-linux-gnu/libc.so.6
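
      The conflict described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not MongoDB code: ToyBatchLock, ToySecondary, and every method name below are hypothetical stand-ins, and the toy lock uses non-blocking try-acquire calls so the example stays deterministic in a single thread. It shows that a reader which insists on the batch lock cannot make progress while a batch is being applied, whereas a reader that skips the batch lock can still read minValid safely because minValid has its own protection.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the PBWM lock. The applier takes it exclusively
// for the duration of a batch; ordinary readers take it in shared mode.
class ToyBatchLock {
public:
    bool tryLockExclusive() {
        std::lock_guard<std::mutex> lk(m_);
        if (exclusive_ || sharedCount_ > 0) return false;
        exclusive_ = true;
        return true;
    }
    void unlockExclusive() {
        std::lock_guard<std::mutex> lk(m_);
        exclusive_ = false;
    }
    bool tryLockShared() {
        std::lock_guard<std::mutex> lk(m_);
        if (exclusive_) return false;  // a batch is in progress
        ++sharedCount_;
        return true;
    }
    void unlockShared() {
        std::lock_guard<std::mutex> lk(m_);
        --sharedCount_;
    }

private:
    std::mutex m_;
    bool exclusive_ = false;
    int sharedCount_ = 0;
};

// Hypothetical model of a secondary: one batch lock plus a minValid value
// that is protected by its own mutex.
struct ToySecondary {
    ToyBatchLock pbwm;
    std::mutex minValidMutex;
    long long minValid = 0;

    // Applier side: advance minValid (the caller is assumed to hold pbwm
    // exclusively while applying a batch).
    void advanceMinValid(long long ts) {
        std::lock_guard<std::mutex> lk(minValidMutex);
        minValid = ts;
    }

    // Reader that conflicts with batch application. Returns false instead of
    // blocking when a batch is in progress.
    bool readMinValidTakingPbwm(long long& out) {
        if (!pbwm.tryLockShared()) return false;
        {
            std::lock_guard<std::mutex> lk(minValidMutex);
            out = minValid;
        }
        pbwm.unlockShared();
        return true;
    }

    // Reader that opts out of the batch lock and relies only on the mutex
    // that guards minValid itself.
    long long readMinValidSkippingPbwm() {
        std::lock_guard<std::mutex> lk(minValidMutex);
        return minValid;
    }
};
```

      To try it, hold the toy batch lock exclusively (as an applier would during a batch), call advanceMinValid(), and compare the two readers: readMinValidTakingPbwm() reports failure until the batch lock is released, while readMinValidSkippingPbwm() returns the current value immediately. In the real server the conflicting reader blocks rather than failing, which is what the trace above shows.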
      

              People

              Assignee:
              backlog-server-repl Backlog - Replication Team
              Reporter:
              siyuan.zhou Siyuan Zhou
              Votes:
              0
              Watchers:
              6