[SERVER-27362] MapReduce can attempt to deregister a cursor without holding a collection lock Created: 09/Dec/16  Updated: 05/Apr/17  Resolved: 19/Dec/16

Status: Closed
Project: Core Server
Component/s: MapReduce
Affects Version/s: 3.4.0
Fix Version/s: 3.4.3, 3.5.2

Type: Bug Priority: Major - P3
Reporter: Charlie Swanson Assignee: Charlie Swanson
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-24196 mapReduce command can unpin a ClientC... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v3.4
Sprint: Query 2017-01-23
Participants:

 Description   

There is code that attempts to prevent this from happening, but it's subtly broken. During the Map/Reduce command we construct a RangePreserver to prevent migrated chunks from being deleted out from under us. The RangePreserver will create a ClientCursor on the collection, preventing the RangeDeleter from deleting migrated ranges. In order to destroy a RangeDeleter you need to hold the collection lock in at least MODE_IS. Destroying a RangePreserver will attempt to delete and de-register its ClientCursor, and thus will need to access the CursorManager. The lock must be held to prevent the CursorManager from being destroyed while its being accessed (the CursorManager is deleted when a Collection is deleted, e.g. by a drop). We attempt to ensure this is the case by using the ON_BLOCK_EXIT macro to acquire a lock and clean up the ClientCursorPin:

        // Prevent sharding state from changing during the MR.
        unique_ptr<RangePreserver> rangePreserver;
        ScopedCollectionMetadata collMetadata;
        {
            AutoGetCollectionForRead ctx(txn, config.ns);
 
            Collection* collection = ctx.getCollection();
            if (collection) {
                rangePreserver.reset(new RangePreserver(collection));
            }
 
            // Get metadata before we check our version, to make sure it doesn't increment
            // in the meantime.  Need to do this in the same lock scope as the block.
            if (ShardingState::get(txn)->needCollectionMetadata(txn, config.ns)) {
                collMetadata = CollectionShardingState::get(txn, config.ns)->getMetadata();
            }
        }
 
        // Ensure that the RangePreserver is freed under the lock. This is necessary since the
        // RangePreserver's destructor unpins a ClientCursor, and access to the CursorManager must
        // be done under the lock.
        ON_BLOCK_EXIT([txn, &config, &rangePreserver] {
            if (rangePreserver) {
                AutoGetCollectionForRead ctx(txn, config.ns);
                rangePreserver.reset();
            }
        });

This almost works, but AutoGetCollectionForRead does more than just acquire locks, it also can throw an exception. When this happens, rangePreserver is not reset in the ON_BLOCK_EXIT block, and instead the unique_ptr is destroyed, deleting the ClientCursorPin without a lock.

This can be seen in this backtrace from one of my patch builds. In this patch, I did some work to allow the ClientCursorPin to hold a reference to an OperationContext, and then I added this dassert inside ClientCursorPin::deregisterCursor():

@@ -236,8 +241,11 @@ void ClientCursorPin::deleteUnderlying() {
     //   we can't simply unpin with the cursor manager lock here, since we need to guarantee
     //   exclusive ownership of the cursor when we are deleting it).
     if (_cursor->cursorManager()) {
+        dassert(_cursor->cursorManager()->getNss().ns().empty() ||
+                _opCtx->lockState()->isCollectionLockedForMode(
+                    _cursor->cursorManager()->getNss().ns(), MODE_IS));
         _cursor->cursorManager()->deregisterCursor(_cursor);
-        _cursor->kill();
+        _cursor->kill(_opCtx, "cursor exhausted");
     }
     _cursor->unsetPinned();
     delete _cursor;

With this applied, the tests jstests/sharding/mapReduce_inSharded_outSharded and jstests/sharding/zBigMapReduce.js both failed on debug builds with this stack trace:

[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.645+0000 d22261| 2016-12-08T04:55:20.646+0000 I -        [conn6] Invariant failure _cursor->cursorManager()->getNss().ns().empty() || _opCtx->lockState()->isCollectionLockedForMode( _cursor->cursorManager()->getNss().ns(), MODE_IS) src\mongo\db\clientcursor.cpp 246
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.645+0000 d22261| 2016-12-08T04:55:20.646+0000 I -        [conn6]
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.645+0000 d22261|
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.645+0000 d22261| ***aborting after invariant() failure
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.645+0000 d22261|
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.645+0000 d22261|
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.684+0000 d22261| 2016-12-08T04:55:20.684+0000 I SHARDING [conn5] MetadataLoader loading chunks for mrShard.srcSharded based on: (empty)
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.688+0000 d22261| 2016-12-08T04:55:20.689+0000 I SHARDING [conn5] Refreshing metadata for collection mrShard.srcSharded from collection version: 4|1||5848e79d4d151577c1e50900, shard version: 4|1||5848e79d4d151577c1e50900 to collection version: 5|1||5848e79d4d151577c1e50900, shard version: 5|1||5848e79d4d151577c1e50900
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.688+0000 d22261| 2016-12-08T04:55:20.689+0000 I SHARDING [conn5] MetadataLoader took 4 ms and found version 5|1||5848e79d4d151577c1e50900
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.688+0000 d22261| 2016-12-08T04:55:20.689+0000 I SHARDING [conn5] Migration succeeded and updated collection version to 5|1||5848e79d4d151577c1e50900
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.690+0000 d22261| 2016-12-08T04:55:20.690+0000 I SHARDING [conn5] about to log metadata event into changelog: { _id: "WIN-K6DURTQEFVN-2016-12-08T04:55:20.690+0000-5848e7b831d7d08e8550c61b", server: "WIN-K6DURTQEFVN", clientAddr: "10.63.93.217:50717", time: new Date(1481172920690), what: "moveChunk.commit", ns: "mrShard.srcSharded", details: { min: { _id: ObjectId('5848e79fadaa73f5c48b9144') }, max: { _id: ObjectId('5848e79fadaa73f5c48b930f') }, from: "shard0001", to: "shard0000" } }
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.709+0000 d22261| 2016-12-08T04:55:20.708+0000 I SHARDING [conn5] forking for cleanup of chunk data
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.709+0000 d22261| 2016-12-08T04:55:20.709+0000 I SHARDING [conn5] waiting for open cursors before removing range [{ _id: ObjectId('5848e79fadaa73f5c48b9144') }, { _id: ObjectId('5848e79fadaa73f5c48b930f') }) in mrShard.srcSharded, cursor ids: [46841715673]
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.710+0000 d22261| 2016-12-08T04:55:20.709+0000 I SHARDING [conn5] about to log metadata event into changelog: { _id: "WIN-K6DURTQEFVN-2016-12-08T04:55:20.709+0000-5848e7b831d7d08e8550c61d", server: "WIN-K6DURTQEFVN", clientAddr: "10.63.93.217:50717", time: new Date(1481172920709), what: "moveChunk.from", ns: "mrShard.srcSharded", details: { min: { _id: ObjectId('5848e79fadaa73f5c48b9144') }, max: { _id: ObjectId('5848e79fadaa73f5c48b930f') }, step 1 of 7: 0, step 2 of 7: 7, step 3 of 7: 32, step 4 of 7: 258, step 5 of 7: 105, step 6 of 7: 51, step 7 of 7: 0, to: "shard0000", from: "shard0001", note: "success" } }
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.729+0000 d22261| 2016-12-08T04:55:20.729+0000 I COMMAND  [conn5] command admin.$cmd appName: "MongoDB Shell" command: moveChunk { moveChunk: "mrShard.srcSharded", shardVersion: [ Timestamp 4000|1, ObjectId('5848e79d4d151577c1e50900') ], epoch: ObjectId('5848e79d4d151577c1e50900'), configdb: "test-configRS/WIN-K6DURTQEFVN:22262,WIN-K6DURTQEFVN:22263,WIN-K6DURTQEFVN:22264", fromShard: "shard0001", toShard: "shard0000", min: { _id: ObjectId('5848e79fadaa73f5c48b9144') }, max: { _id: ObjectId('5848e79fadaa73f5c48b930f') }, chunkVersion: [ Timestamp 4000|1, ObjectId('5848e79d4d151577c1e50900') ], maxChunkSizeBytes: 1048576, waitForDelete: false, takeDistLock: false } numYields:3 reslen:74 locks:{ Global: { acquireCount: { r: 23, w: 7 } }, MMAPV1Journal: { acquireCount: { r: 8, w: 11 } }, Database: { acquireCount: { r: 8, w: 5, W: 2 }, acquireWaitCount: { w: 2 }, timeAcquiringMicros: { w: 3458 } }, Collection: { acquireCount: { R: 8, W: 5 } } } protocol:op_command 478ms
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.888+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\util\stacktrace_windows.cpp(239)                                   mongo::printStackTrace+0x43
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.888+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\util\signal_handlers_synchronous.cpp(180)                          mongo::`anonymous namespace'::printSignalAndBacktrace+0x73
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.888+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\util\signal_handlers_synchronous.cpp(236)                          mongo::`anonymous namespace'::abruptQuit+0x83
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.888+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    d:\th\minkernel\crts\ucrt\src\appcrt\misc\signal.cpp(522)                        raise+0x468
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.888+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    d:\th\minkernel\crts\ucrt\src\appcrt\startup\abort.cpp(71)                       abort+0x39
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.890+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\util\assert_util.cpp(154)                                          mongo::invariantFailed+0x20c
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.890+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\db\clientcursor.cpp(244)                                           mongo::ClientCursorPin::deleteUnderlying+0x13c
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.890+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    c:\program files (x86)\microsoft visual studio 14.0\vc\include\memory(1397)      std::unique_ptr<mongo::RangePreserver,std::default_delete<mongo::RangePreserver> >::~unique_ptr<mongo::RangePreserver,std::default_delete<mongo::RangePreserver> >+0x29
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.890+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\db\commands\mr.cpp(1659)                                           mongo::mr::MapReduceCommand::run+0x1b45
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.890+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\db\commands\dbcommands.cpp(1528)                                   mongo::Command::run+0x6db
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.890+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\db\commands\dbcommands.cpp(1438)                                   mongo::Command::execCommand+0xc80
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.890+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\db\run_commands.cpp(73)                                            mongo::runCommands+0x510
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\db\instance.cpp(273)                                               mongo::`anonymous namespace'::receivedRpc+0x1e7
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\db\instance.cpp(618)                                               mongo::assembleResponse+0x815
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\db\service_entry_point_mongod.cpp(135)                             mongo::ServiceEntryPointMongod::_sessionLoop+0x195
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    c:\program files (x86)\microsoft visual studio 14.0\vc\include\functional(212)   std::_Func_impl<<lambda_0476eb5123845e42a2f452f2efde6866>,std::allocator<int>,void,std::shared_ptr<mongo::transport::Session> const & __ptr64>::_Do_call+0x43
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.889+0000 I CONTROL  [conn6] mongod.exe    ...\src\mongo\transport\service_entry_point_utils.cpp(78)                        mongo::`anonymous namespace'::runFunc+0x25e
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.890+0000 I CONTROL  [conn6] mongod.exe    c:\program files (x86)\microsoft visual studio 14.0\vc\include\thr\xthread(247)  std::_LaunchPad<std::unique_ptr<std::tuple<std::_Binder<std::_Unforced,void * __ptr64 (__cdecl&)(void * __ptr64),mongo::`anonymous namespace'::Context * __ptr64> >,std::default_delete<std::tuple<std::_Binder<std::_Unforced,void * __ptr64 (__cdecl&)(void * __ptr64),mongo::`anonymous namespace'::Context * __ptr64> > > > >::_Run+0x6f
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.890+0000 I CONTROL  [conn6] mongod.exe    c:\program files (x86)\microsoft visual studio 14.0\vc\include\thr\xthread(210)  std::_Pad::_Call_func+0x9
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.890+0000 I CONTROL  [conn6] mongod.exe    d:\th\minkernel\crts\ucrt\src\appcrt\startup\thread.cpp(92)                      invoke_thread_procedure+0x2c
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.890+0000 I CONTROL  [conn6] mongod.exe    d:\th\minkernel\crts\ucrt\src\appcrt\startup\thread.cpp(115)                     thread_start<unsigned int (__cdecl*)(void * __ptr64)>+0x93
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.890+0000 I CONTROL  [conn6] kernel32.dll                                                                                   BaseThreadInitThunk+0xd
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.890+0000 F -        [conn6] Got signal: 22 (SIGABRT).
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.890+0000 I CONTROL  [conn6] *** unhandled exception 0x0000000E at 0x000007FEFCEBA06D, terminating
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.891+0000 d22261| 2016-12-08T04:55:20.890+0000 I CONTROL  [conn6] *** stack trace for unhandled exception:
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.898+0000 d22261| 2016-12-08T04:55:20.899+0000 I CONTROL  [conn6] KERNELBASE.dll                                                                                   RaiseException+0x3d
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.898+0000 d22261| 2016-12-08T04:55:20.899+0000 I CONTROL  [conn6] mongod.exe      ...\src\mongo\util\signal_handlers_synchronous.cpp(237)                          mongo::`anonymous namespace'::abruptQuit+0x9b
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.898+0000 d22261| 2016-12-08T04:55:20.899+0000 I CONTROL  [conn6] mongod.exe      d:\th\minkernel\crts\ucrt\src\appcrt\misc\signal.cpp(522)                        raise+0x468
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.900+0000 d22261| 2016-12-08T04:55:20.899+0000 I CONTROL  [conn6] mongod.exe      d:\th\minkernel\crts\ucrt\src\appcrt\startup\abort.cpp(71)                       abort+0x39
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.900+0000 d22261| 2016-12-08T04:55:20.899+0000 I CONTROL  [conn6] mongod.exe      ...\src\mongo\util\assert_util.cpp(154)                                          mongo::invariantFailed+0x20c
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.900+0000 d22261| 2016-12-08T04:55:20.899+0000 I CONTROL  [conn6] mongod.exe      ...\src\mongo\db\clientcursor.cpp(244)                                           mongo::ClientCursorPin::deleteUnderlying+0x13c
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.900+0000 d22261| 2016-12-08T04:55:20.899+0000 I CONTROL  [conn6] mongod.exe      c:\program files (x86)\microsoft visual studio 14.0\vc\include\memory(1397)      std::unique_ptr<mongo::RangePreserver,std::default_delete<mongo::RangePreserver> >::~unique_ptr<mongo::RangePreserver,std::default_delete<mongo::RangePreserver> >+0x29
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.900+0000 d22261| 2016-12-08T04:55:20.899+0000 I CONTROL  [conn6] mongod.exe      ...\src\mongo\db\commands\mr.cpp(1659)                                           mongo::mr::MapReduceCommand::run+0x1b45
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.900+0000 d22261| 2016-12-08T04:55:20.899+0000 I CONTROL  [conn6] mongod.exe      ...\src\mongo\db\commands\dbcommands.cpp(1528)                                   mongo::Command::run+0x6db
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.900+0000 d22261| 2016-12-08T04:55:20.899+0000 I CONTROL  [conn6] mongod.exe      ...\src\mongo\db\commands\dbcommands.cpp(1438)                                   mongo::Command::execCommand+0xc80
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] mongod.exe      ...\src\mongo\db\run_commands.cpp(73)                                            mongo::runCommands+0x510
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] mongod.exe      ...\src\mongo\db\instance.cpp(273)                                               mongo::`anonymous namespace'::receivedRpc+0x1e7
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] mongod.exe      ...\src\mongo\db\instance.cpp(618)                                               mongo::assembleResponse+0x815
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] mongod.exe      ...\src\mongo\db\service_entry_point_mongod.cpp(135)                             mongo::ServiceEntryPointMongod::_sessionLoop+0x195
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] mongod.exe      c:\program files (x86)\microsoft visual studio 14.0\vc\include\functional(212)   std::_Func_impl<<lambda_0476eb5123845e42a2f452f2efde6866>,std::allocator<int>,void,std::shared_ptr<mongo::transport::Session> const & __ptr64>::_Do_call+0x43
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] mongod.exe      ...\src\mongo\transport\service_entry_point_utils.cpp(78)                        mongo::`anonymous namespace'::runFunc+0x25e
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] mongod.exe      c:\program files (x86)\microsoft visual studio 14.0\vc\include\thr\xthread(247)  std::_LaunchPad<std::unique_ptr<std::tuple<std::_Binder<std::_Unforced,void * __ptr64 (__cdecl&)(void * __ptr64),mongo::`anonymous namespace'::Context * __ptr64> >,std::default_delete<std::tuple<std::_Binder<std::_Unforced,void * __ptr64 (__cdecl&)(void * __ptr64),mongo::`anonymous namespace'::Context * __ptr64> > > > >::_Run+0x6f
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] mongod.exe      c:\program files (x86)\microsoft visual studio 14.0\vc\include\thr\xthread(210)  std::_Pad::_Call_func+0x9
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] mongod.exe      d:\th\minkernel\crts\ucrt\src\appcrt\startup\thread.cpp(92)                      invoke_thread_procedure+0x2c
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] mongod.exe      d:\th\minkernel\crts\ucrt\src\appcrt\startup\thread.cpp(115)                     thread_start<unsigned int (__cdecl*)(void * __ptr64)>+0x93
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] kernel32.dll                                                                                     BaseThreadInitThunk+0xd
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I -        [conn6]
[js_test:mapReduce_inSharded_outSharded] 2016-12-08T04:55:20.901+0000 d22261| 2016-12-08T04:55:20.900+0000 I CONTROL  [conn6] writing minidump diagnostic file C:\data\mci\b4699d47a7958c4241a69935b1137ca0\src\mongod.2016-12-08T04-55-20.mdmp

This patch seems to fix the issue, but causes this assertion in jstests/core/profile_mapReduce.js, I think because the removed AutoGetCollectionForRead was setting the CurOp's ns back to the source collection, even though the last operation was actually an insert into the temp collection that is eventually renamed into the output collection.

diff --git a/src/mongo/db/commands/mr.cpp b/src/mongo/db/commands/mr.cpp
index 3abd4a8..6c1afde 100644
--- a/src/mongo/db/commands/mr.cpp
+++ b/src/mongo/db/commands/mr.cpp
@@ -1417,7 +1417,8 @@ public:
         // be done under the lock.
         ON_BLOCK_EXIT([txn, &config, &rangePreserver] {
             if (rangePreserver) {
-                AutoGetCollectionForRead ctx(txn, config.ns);
+                // Be sure not to use AutoGetCollectionForRead here, as that may throw.
+                AutoGetCollection ctx(txn, NamespaceString(config.ns), MODE_IS);
                 rangePreserver.reset();
             }
         });



 Comments   
Comment by Githook User [ 01/Mar/17 ]

Author:

{u'username': u'cswanson310', u'name': u'Charlie Swanson', u'email': u'charlie.swanson@mongodb.com'}

Message: SERVER-27362 Avoid extra logic while getting lock to destroy M/R cursor.

The mapReduce commands opens a ClientCursor to prevent the RangeDeleter
from deleting data from chunks that have been migrated away from the
shard. To destroy the ClientCursor at the end of the command, it needs
to take a lock on the collection the cursor is registered on. It should
not use AutoGetCollectionForRead to acquire this lock since that will
do unnecessary extra tasks, including throwing an exception if the
shard version no longer matches. Instead, it should simply use
AutoGetCollection to get the required lock, to ensure the ClientCursor
can be cleaned up correctly.

(cherry picked from commit cc8caa857db1155d925ef6aa5c57ca3796aa4bdd)
Branch: v3.4
https://github.com/mongodb/mongo/commit/d89fbf8e72dc9ce8784cfcbccbd43cb3cc3d2c46

Comment by Githook User [ 19/Dec/16 ]

Author:

{u'username': u'cswanson310', u'name': u'Charlie Swanson', u'email': u'charlie.swanson@mongodb.com'}

Message: SERVER-27362 Avoid extra logic while getting lock to destroy M/R cursor.

The mapReduce commands opens a ClientCursor to prevent the RangeDeleter
from deleting data from chunks that have been migrated away from the
shard. To destroy the ClientCursor at the end of the command, it needs
to take a lock on the collection the cursor is registered on. It should
not use AutoGetCollectionForRead to acquire this lock since that will
do unnecessary extra tasks, including throwing an exception if the
shard version no longer matches. Instead, it should simply use
AutoGetCollection to get the required lock, to ensure the ClientCursor
can be cleaned up correctly.
Branch: master
https://github.com/mongodb/mongo/commit/cc8caa857db1155d925ef6aa5c57ca3796aa4bdd

Generated at Thu Feb 08 04:14:57 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.