When stressing the convertToCapped codepath in suites with stepdowns, this fatal assertion is constantly hit due to DuplicateKey errors.
Purpose of this ticket is to assess whether it's a test-only issue (because in production that assertion is converted to a simple warning) or - otherwise - to identify the problem and fix it.
Reproducer
Simply force executing the convert_to_capped_collection FSM in a suite with stepdowns:
./buildscripts/resmoke.py run --suite concurrency_sharded_with_stepdowns --storageEngine=wiredTiger --jobs=1 --storageEngineCacheSizeGB=0.5 --dbpath=/tmp/testpath --runAllFeatureFlagTests --force-excluded-tests jstests/concurrency/fsm_workloads/convert_to_capped_collection.js
Logs from a failed execution
[j0:s1:n2] 2024-02-08T11:09:48.352+00:00 I REPL [BackgroundSync] Processing DDL command oplog entry in OplogBatcher{"oplogEntry":{"oplogEntry":{"op":"c","ns":"test0_fsmdb0.$cmd","ui":{ "$uuid":"491e153d-4706-480c-97cc-27d5ad54e1bd"},"o":{"renameCollection":"test0_fsmdb0.tmpdkL8n.convertToCapped.convert_to_capped_collection_5","to":"test0_fsmdb0.convert_to_capped_collectio n_5","stayTemp":false,"dropTarget":{"$uuid":"5802bbae-1d09-40df-bf71-26cf8b547cfc"}},"o2":{"numRecords":1200},"ts":{"$timestamp":{"t":1707390584,"i":7853}},"t":6,"v":2,"wall":{"$date":"2024 -02-08T11:09:44.531Z"}}}} [j0:s1:n2] 2024-02-08T11:09:48.354+00:00 F REPL [ReplWriterWorker-5] Error applying operation while recovering from stable checkpoint. This can lead to data corruption.{"oplogEntry":{" oplogEntry":{"op":"i","ns":"test0_fsmdb0.tmpJhbuV.convertToCapped.convert_to_capped_collection_8","ui":{"$uuid":"aa48d5a1-e825-40f2-aad9-70b6c0d0c64f"},"o":{"_id":{"$oid":"65c4b647eedaadff2 2434576"},"i":1181,"rand":0.752134560170723},"o2":{"_id":{"$oid":"65c4b647eedaadff22434576"}},"fromMigrate":true,"ts":{"$timestamp":{"t":1707390584,"i":6304}},"t":6,"v":2,"wall":{"$date":"2 024-02-08T11:09:44.367Z"}},"isForCappedCollection":true},"error":"DuplicateKey{ keyPattern: { _id: 1 }, keyValue: { _id: ObjectId('65c4b647eedaadff22434576') }, foundValue: 1182, duplicateR id: 1182 }: E11000 duplicate key error collection: test0_fsmdb0.tmpJhbuV.convertToCapped.convert_to_capped_collection_8 index: _id_ dup key: { _id: ObjectId('65c4b647eedaadff22434576') } fo und value: RecordId(1182)"} 62198 [j0:s1:n2] 2024-02-08T11:09:48.354+00:00 F ASSERT [ReplWriterWorker-5] Fatal assertion{"msgid":5415000,"file":"src/mongo/db/repl/oplog.cpp","line":1389} [j0:s1:n2] 2024-02-08T11:09:48.355+00:00 F ASSERT [ReplWriterWorker-5] ***aborting after fassert() failure [j0:s1:n2] 2024-02-08T11:09:48.355+00:00 F CONTROL [ReplWriterWorker-5] Writing fatal message{"message":"\n"} [j0:s1:n2] 2024-02-08T11:09:48.355+00:00 F CONTROL [ReplWriterWorker-5] Writing fatal message{"message":"Got signal: 6 (Aborted).\n"} [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BE70A980 funlockfile+0x50 [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BE345E87 gsignal+0xC7 [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BE3477F1 abort+0x141 [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0C091E4FB mongo::(anonymous namespace)::callAbort()+0x1B [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0C091ED82 mongo::fassertFailedWithLocation(int, char const*, unsigned int)+0x102 [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0B9B9189F mongo::repl::OplogApplication::checkOnOplogFailureForRecovery(mongo::OperationContext*, mongo::NamespaceString const&, mongo::BSONObj const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x29F [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0B9B956C6 mongo::repl::applyOperation_inlock(mongo::OperationContext*, mongo::CollectionAcquisition&, mongo::repl::OplogEntryOrGroupedInserts const&, bool, mongo::repl::OplogApplication::Mode, bool, std::function<void ()>)+0x38D6 [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BB2F1A70 mongo::repl::OplogApplierUtils::applyOplogEntryOrGroupedInsertsCommon(mongo::OperationContext*, mongo::repl::OplogEntryOrGroupedInserts const&, mongo::repl::OplogApplication::Mode, bool, std::function<void ()>, mongo::OpCounters*)::$_1::operator()() const+0xAD0 [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BB2EF17F mongo::repl::OplogApplierUtils::applyOplogEntryOrGroupedInsertsCommon(mongo::OperationContext*, mongo::repl::OplogEntryOrGroupedInserts const&, mongo::repl::OplogApplication::Mode, bool, std::function<void ()>, mongo::OpCounters*)+0x53F [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BB2DB91B mongo::repl::applyOplogEntryOrGroupedInserts(mongo::OperationContext*, mongo::repl::OplogEntryOrGroupedInserts const&, mongo::repl::OplogApplication::Mode, bool)+0x10B [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BB2EBC9E std::_Function_handler<mongo::Status(mongo::OperationContext*, mongo::repl::OplogEntryOrGroupedInserts const&, mongo::repl::OplogApplication::Mode, bool), mongo::Status (*)(mongo::OperationContext*, mongo::repl::OplogEntryOrGroupedInserts const&, mongo::repl::OplogApplication::Mode, bool)>::_M_invoke(std::_Any_data const&, mongo::OperationContext*&&, mongo::repl::OplogEntryOrGroupedInserts const&, mongo::repl::OplogApplication::Mode&&, bool&&)+0x1E [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BB2F01EF mongo::repl::OplogApplierUtils::applyOplogBatchCommon(mongo::OperationContext*, std::vector<mongo::repl::ApplierOperation, std::allocator<mongo::repl::ApplierOperation> >*, mongo::repl::OplogApplication::Mode, bool, bool, std::function<mongo::Status(mongo::OperationContext*, mongo::repl::OplogEntryOrGroupedInserts const&, mongo::repl::OplogApplication::Mode, bool)>)+0x24F [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BB2DC7B7 mongo::repl::OplogApplierImpl::applyOplogBatchPerWorker(mongo::OperationContext*, std::vector<mongo::repl::ApplierOperation, std::allocator<mongo::repl::ApplierOperation> >*, std::vector<mongo::MultikeyPathInfo, std::allocator<mongo::MultikeyPathInfo> >*, bool)+0xF7 [j0:s1:n2] 2024-02-08T11:09:48.446+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BB2DE959 auto mongo::unique_function<void (mongo::Status)>::makeImpl<mongo::repl::OplogApplierImpl::_applyOplogBatch(mongo::OperationContext*, std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> >)::$_2>(mongo::repl::OplogApplierImpl::_applyOplogBatch(mongo::OperationContext*, std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> >)::$_2&&)::SpecificImpl::call(mongo::Status&&)+0x119 [j0:s1:n2] 2024-02-08T11:09:48.447+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0B3CF28F7 mongo::ThreadPool::Impl::_doOneTask(std::unique_lock<std::mutex>*)+0x1C7 [j0:s1:n2] 2024-02-08T11:09:48.447+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0B3CF1CC7 mongo::ThreadPool::Impl::_consumeTasks()+0xC7 [j0:s1:n2] 2024-02-08T11:09:48.447+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0B3CF1AB6 mongo::ThreadPool::Impl::_workerThreadBody(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1A6 [j0:s1:n2] 2024-02-08T11:09:48.447+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0B3CF3085 std::thread::_State_impl<std::thread::_Invoker<std::tuple<mongo::stdx::thread::thread<mongo::ThreadPool::Impl::_startWorkerThread_inlock()::$_5, , 0>(&&)::{lambda()#1}> > >::_M_run()+0x65 [j0:s1:n2] 2024-02-08T11:09:48.447+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BFDD2EE4 execute_native_thread_routine+0x14 [j0:s1:n2] 2024-02-08T11:09:48.447+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BE6FF6DB start_thread+0xDB [j0:s1:n2] 2024-02-08T11:09:48.447+00:00 I CONTROL [ReplWriterWorker-5] Frame: 0x7FA0BE42861F clone+0x3F