Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-57476

Operation may block on prepare conflict while holding oplog slot, stalling replication indefinitely

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 4.2.0, 4.4.0, 5.0.0-rc0
    • Fix Version/s: 5.1.0, 4.2.15, 4.4.7, 5.0.0-rc2
    • Component/s: None
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v5.0, v4.4, v4.2
    • Sprint:
      Repl 2021-06-14
    • Case:

      Description

      wiredTigerPrepareConflictRetry() doesn't release any storage resources when WiredTiger returns WT_PREPARE_CONFLICT. This is because prepare conflicts are expected to resolve quickly and the storage transaction may already have a snapshot compatible with the commitTimestamp of the prepared transaction. However, committing or aborting a prepared transaction depends on the ability of the replica set to replicate and confirm majority-committed writes. If the operation blocked in wiredTigerPrepareConflictRetry() has reserved an oplog slot, then oplog readers will block until the oplog hole is resolved. But oplog readers being blocked prevents new incoming writes from becoming majority-committed, which can prevent a prepared transaction from exiting the kPrepared state.


      Thread 56: "conn2" (Thread 0x7f4ee3799700 (LWP 15315))
      ...
      #11 mongo::Interruptible::waitForConditionOrInterrupt<std::unique_lock<mongo::latch_detail::Latch>, mongo::WiredTigerSessionCache::waitUntilPreparedUnitOfWorkCommitsOrAborts(mongo::OperationContext*, uint64_t)::<lambda()> > (waitTimer=0x0, pred=..., m=..., cv=..., this=0x55802d3d26c0) at src/mongo/util/interruptible.h:466
      #12 mongo::WiredTigerSessionCache::waitUntilPreparedUnitOfWorkCommitsOrAborts (this=<optimized out>, opCtx=opCtx@entry=0x55802d3d26c0, lastCount=<optimized out>, lastCount@entry=0) at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp:369
      #13 0x00007f4f0c17c5e1 in mongo::wiredTigerPrepareConflictRetry<mongo::WiredTigerIndexUnique::_keyExists(mongo::OperationContext*, WT_CURSOR*, char const*, size_t)::<lambda()> >(mongo::OperationContext *, mongo::WiredTigerIndexUnique::<lambda()> &&) (opCtx=0x55802d3d26c0, f=...) at src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.h:192
      #14 0x00007f4f0c17ce24 in mongo::WiredTigerIndexUnique::_keyExists (this=this@entry=0x55802cba1380, opCtx=opCtx@entry=0x55802d3d26c0, c=<optimized out>, c@entry=0x55802d8d1400, buffer=0x55802d43bc38 "+\002\004", size=size@entry=3) at src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp:1400
      #15 0x00007f4f0c17d0da in mongo::WiredTigerIndexUnique::_insertTimestampSafe (this=0x55802cba1380, opCtx=0x55802d3d26c0, c=0x55802d8d1400, keyString=..., dupsAllowed=<optimized out>) at src/mongo/util/shared_buffer.h:204
      #16 0x00007f4f0c17e2dc in mongo::WiredTigerIndexUnique::_insert (this=0x55802cba1380, opCtx=0x55802d3d26c0, c=0x55802d8d1400, keyString=..., dupsAllowed=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp:1474
      #17 0x00007f4f0c178b50 in mongo::WiredTigerIndex::insert (this=0x55802cba1380, opCtx=0x55802d3d26c0, keyString=..., dupsAllowed=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp:267
      #18 0x00007f4f096657d0 in mongo::AbstractIndexAccessMethod::insertKeys(mongo::OperationContext*, std::set<mongo::KeyString::Value, std::less<mongo::KeyString::Value>, std::allocator<mongo::KeyString::Value> > const&, mongo::RecordId const&, mongo::InsertDeleteOptions const&, std::function<mongo::Status (mongo::KeyString::Value const&)>&&, long*) (this=0x55802cc25440, opCtx=0x55802d3d26c0, keys=std::set with 1 element = {...}, loc=..., options=..., onDuplicateKey=..., numInserted=0x7f4ee3795510) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/unique_ptr.h:342
      #19 0x00007f4f0966596e in mongo::AbstractIndexAccessMethod::insertKeysAndUpdateMultikeyPaths(mongo::OperationContext*, std::set<mongo::KeyString::Value, std::less<mongo::KeyString::Value>, std::allocator<mongo::KeyString::Value> > const&, std::set<mongo::KeyString::Value, std::less<mongo::KeyString::Value>, std::allocator<mongo::KeyString::Value> > const&, std::vector<std::set<unsigned long, std::less<unsigned long>, std::allocator<unsigned long> >, std::allocator<std::set<unsigned long, std::less<unsigned long>, std::allocator<unsigned long> > > > const&, mongo::RecordId const&, mongo::InsertDeleteOptions const&, std::function<mongo::Status (mongo::KeyString::Value const&)>&&, long*) (this=0x55802cc25440, opCtx=0x55802d3d26c0, keys=std::set with 1 element = {...}, multikeyMetadataKeys=std::set with 0 elements, multikeyPaths=std::vector of length 1, capacity 1 = {...}, loc=..., options=..., onDuplicateKey=..., numInserted=0x7f4ee3795510) at src/mongo/db/index/index_access_method.cpp:164
      #20 0x00007f4f0a2bae14 in mongo::IndexCatalogImpl::_indexKeys (this=<optimized out>, opCtx=0x55802d3d26c0, index=<optimized out>, keys=std::set with 1 element = {...}, multikeyMetadataKeys=..., multikeyPaths=std::vector of length 1, capacity 1 = {...}, obj=owned BSONObj 33 bytes @ 0x55802d35bf2b = {...}, loc=..., options=..., keysInsertedOut=0x7f4ee37958e8) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/std_function.h:402
      #21 0x00007f4f0a2c2bce in mongo::IndexCatalogImpl::_indexFilteredRecords (this=0x55802cc26340, opCtx=0x55802d3d26c0, index=0x55802d35b810, bsonRecords=..., keysInsertedOut=0x7f4ee37958e8) at src/mongo/db/catalog/index_catalog_impl.cpp:1408
      #22 0x00007f4f0a2c2f34 in mongo::IndexCatalogImpl::_indexRecords (this=0x55802cc26340, opCtx=0x55802d3d26c0, index=0x55802d35b810, bsonRecords=std::vector of length 3, capacity 3 = {...}, keysInsertedOut=0x7f4ee37958e8) at src/mongo/db/catalog/index_catalog_impl.cpp:1435
      #23 0x00007f4f0a2c3032 in mongo::IndexCatalogImpl::indexRecords (this=0x55802cc26340, opCtx=0x55802d3d26c0, bsonRecords=std::vector of length 3, capacity 3 = {...}, keysInsertedOut=0x7f4ee37958e8) at src/mongo/db/catalog/index_catalog_impl.cpp:1591
      #24 0x00007f4f0a29baab in mongo::CollectionImpl::_insertDocuments (this=0x55802d35b100, opCtx=0x55802d3d26c0, begin={stmtId = 0, oplogSlot = {_timestamp = {i = 1, secs = 536870912}, _term = 1}, doc = <error reading variable: Cannot access memory at address 0x4000000000000001>{_objdata = 0x4000000000000001 <error: Cannot access memory at address 0x4000000000000001>, _ownedBuffer = {_buffer = {_holder = {px = 0x2}}}}}, end={stmtId = 0, oplogSlot = {_timestamp = {i = 1, secs = 536870912}, _term = 1}, doc = <error reading variable: Cannot access memory at address 0x4000000000000001>{_objdata = 0x4000000000000001 <error: Cannot access memory at address 0x4000000000000001>, _ownedBuffer = {_buffer = {_holder = {px = 0x2}}}}}, opDebug=0x55802d359450) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/unique_ptr.h:342
      #25 0x00007f4f0a2a034e in mongo::CollectionImpl::insertDocuments (this=this@entry=0x55802d35b100, opCtx=opCtx@entry=0x55802d3d26c0, begin=begin@entry={stmtId = -1, oplogSlot = {_timestamp = {i = 5, secs = 1622990608}, _term = 1}, doc = owned BSONObj 33 bytes @ 0x55802d35bf2b = {[_id] = {"$oid": "60bcdf109bb2288d9de42896"}, [a] = 1.0}}, end=end@entry={stmtId = 0, oplogSlot = {_timestamp = {i = 1, secs = 536870912}, _term = 1}, doc = <error reading variable: Cannot access memory at address 0x4000000000000001>{_objdata = 0x4000000000000001 <error: Cannot access memory at address 0x4000000000000001>, _ownedBuffer = {_buffer = {_holder = {px = 0x2}}}}}, opDebug=0x55802d359450, fromMigrate=false) at src/mongo/db/catalog/collection_impl.cpp:505
      #26 0x00007f4f0aca3963 in mongo::(anonymous namespace)::insertDocuments (opCtx=0x55802d3d26c0, collection=<optimized out>, begin=..., end={stmtId = 0, oplogSlot = {_timestamp = {i = 1, secs = 536870912}, _term = 1}, doc = <error reading variable: Cannot access memory at address 0x4000000000000001>{_objdata = 0x4000000000000001 <error: Cannot access memory at address 0x4000000000000001>, _ownedBuffer = {_buffer = {_holder = {px = 0x2}}}}}, fromMigrate=<optimized out>) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/stl_iterator.h:845
      #27 0x00007f4f0aca45a1 in mongo::(anonymous namespace)::insertBatchAndHandleErrors (fromMigrate=<optimized out>, out=<optimized out>, lastOpFixer=<optimized out>, batch=std::vector of length 3, capacity 3 = {...}, wholeOp=..., opCtx=<optimized out>) at src/mongo/db/catalog_raii.h:131
      #28 mongo::performInserts (opCtx=<optimized out>, opCtx@entry=0x55802d3d26c0, wholeOp=..., fromMigrate=fromMigrate@entry=false) at src/mongo/db/ops/write_ops_exec.cpp:621
      #29 0x00007f4f04442cef in mongo::(anonymous namespace)::CmdInsert::Invocation::runImpl (this=0x55802d4090c0, opCtx=0x55802d3d26c0, result=...) at src/mongo/db/commands/write_commands/write_commands.cpp:299
      #30 0x00007f4f044403dd in mongo::(anonymous namespace)::WriteCommand::InvocationBase::run (this=0x55802d4090c0, opCtx=0x55802d3d26c0, result=0x55802cf10cc0) at src/mongo/db/commands/write_commands/write_commands.cpp:241
      #31 0x00007f4f08db14ff in mongo::CommandHelpers::runCommandInvocation (opCtx=0x55802d3d26c0, request=..., invocation=0x55802d4090c0, response=0x55802cf10cc0) at src/mongo/db/commands.cpp:184
      #32 0x00007f4f0469364f in mongo::(anonymous namespace)::runCommandImpl (opCtx=<optimized out>, invocation=<optimized out>, request=..., replyBuilder=0x55802cf10cc0, startOperationTime=..., behaviors=..., extraFieldsBuilder=0x7f4ee37968e0, sessionOptions=...) at src/mongo/db/service_entry_point_common.cpp:828
      #33 0x00007f4f04695cf9 in mongo::(anonymous namespace)::execCommandDatabase (opCtx=<optimized out>, command=<optimized out>, request=..., replyBuilder=<optimized out>, behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1226
      #34 0x00007f4f04697070 in mongo::(anonymous namespace)::<lambda()>::operator() (__closure=<optimized out>) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/unique_ptr.h:342
      #35 mongo::(anonymous namespace)::receivedCommands (opCtx=0x55802d3d26c0, message=..., behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1440
      #36 0x00007f4f04697cfd in mongo::ServiceEntryPointCommon::handleRequest (opCtx=0x55802d3d26c0, m=..., behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1745
      

      Thread 43: "conn1" (Thread 0x7f4ee9cbf700 (LWP 15300))
      ...
      #11 mongo::Interruptible::waitForConditionOrInterrupt<std::unique_lock<mongo::latch_detail::Latch>, mongo::WiredTigerOplogManager::waitForAllEarlierOplogWritesToBeVisible(const mongo::WiredTigerRecordStore*, mongo::OperationContext*)::<lambda()> > (waitTimer=0x0, pred=..., m=..., cv=..., this=0x55802d3d18c0) at src/mongo/util/interruptible.h:466
      #12 mongo::WiredTigerOplogManager::waitForAllEarlierOplogWritesToBeVisible (this=<optimized out>, oplogRecordStore=<optimized out>, opCtx=0x55802d3d18c0) at src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp:147
      #13 0x00007f4f09ea2244 in mongo::CollectionScan::doWork (this=0x55802cb75c00, out=0x7f4ee9cbb8c8) at src/mongo/db/exec/plan_stage.h:383
      #14 0x00007f4f09ec3316 in mongo::PlanStage::work (this=0x55802cb75c00, out=out@entry=0x7f4ee9cbb8c8) at src/mongo/db/exec/plan_stage.cpp:47
      #15 0x00007f4f09f20f7a in mongo::PlanExecutorImpl::_getNextImpl (this=0x55802d379960, objOut=0x7f4ee9cbba30, dlOut=0x0) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/unique_ptr.h:342
      #16 0x00007f4f09f219bb in mongo::PlanExecutorImpl::getNext (this=<optimized out>, objOut=0x7f4ee9cbbb28, dlOut=<optimized out>) at src/mongo/db/query/plan_executor_impl.cpp:413
      #17 0x00007f4f0440f753 in mongo::(anonymous namespace)::FindCmd::Invocation::run (this=0x55802cf10680, opCtx=0x55802d3d18c0, result=<optimized out>) at src/mongo/db/commands/find_cmd.cpp:516
      #18 0x00007f4f08db14ff in mongo::CommandHelpers::runCommandInvocation (opCtx=0x55802d3d18c0, request=..., invocation=0x55802cf10680, response=0x55802cf104a0) at src/mongo/db/commands.cpp:184
      #19 0x00007f4f04692d72 in mongo::(anonymous namespace)::runCommandImpl (opCtx=<optimized out>, invocation=<optimized out>, request=..., replyBuilder=0x55802cf104a0, startOperationTime=..., behaviors=..., extraFieldsBuilder=0x7f4ee9cbc8e0, sessionOptions=...) at src/mongo/db/service_entry_point_common.cpp:859
      #20 0x00007f4f04695cf9 in mongo::(anonymous namespace)::execCommandDatabase (opCtx=<optimized out>, command=<optimized out>, request=..., replyBuilder=<optimized out>, behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1226
      #21 0x00007f4f04697070 in mongo::(anonymous namespace)::<lambda()>::operator() (__closure=<optimized out>) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.U0D/include/c++/8.2.0/bits/unique_ptr.h:342
      #22 mongo::(anonymous namespace)::receivedCommands (opCtx=0x55802d3d18c0, message=..., behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1440
      #23 0x00007f4f04697cfd in mongo::ServiceEntryPointCommon::handleRequest (opCtx=0x55802d3d18c0, m=..., behaviors=...) at src/mongo/db/service_entry_point_common.cpp:1745
      

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              daniel.gottlieb Daniel Gottlieb
              Reporter:
              max.hirschhorn Max Hirschhorn
              Participants:
              Votes:
              0 Vote for this issue
              Watchers:
              24 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: