[SERVER-31844] mapReduce unsafely disposes of its PlanExecutor
Created: 06/Nov/17  Updated: 24/Nov/23  Resolved: 07/Nov/17

Status: Closed
Project: Core Server
Component/s: MapReduce, Querying
Affects Version/s: 3.6.0-rc2
Fix Version/s: 3.6.0-rc4

Type: Bug
Priority: Major - P3
Reporter: David Storch
Assignee: David Storch
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query 2017-11-13
Participants:
Linked BF Score: 0

 Description   

The PlanExecutor created by the mapReduce command is managed by the collection over which the mapReduce is running. This means that a lock on the collection must be held while the PlanExecutor is being disposed, so that deregistration of the executor is synchronized correctly with catalog-level events such as collection drops. The following dassert() checks in debug builds that the correct locks are held:

https://github.com/mongodb/mongo/blob/dc04d7d6f22e6542f9f20cf33cd40015cefcf530/src/mongo/db/query/plan_executor.cpp#L653

The mapReduce code holds the proper locks for most of its execution, but these locks are temporarily released in order to perform the reduce step. If State::reduceAndSpillInMemoryStateIfNeeded() throws an exception while the locks are released, the PlanExecutor is disposed during unwinding without the required collection lock held.

Our continuous integration testing has caught this specifically in the case of a mapReduce running while the mongod is shutting down, since this causes State::reduceAndSpillInMemoryStateIfNeeded() to throw with an InterruptedAtShutdown error. The result is an invariant() failure in a debug build, though the system could also crash in a non-debug build due to using a Collection object that has been freed.
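
The failure mode described above can be sketched in isolation. The following is a minimal, self-contained C++ model, not the server code: CollectionLock, Executor, runUnsafe() and runSafe() are hypothetical names invented for illustration, the lock is a plain flag rather than the real lock manager, and runSafe() shows only one possible way to keep disposal under the lock, which is not necessarily what the committed fix does.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for a collection lock (just a flag, no real locking).
struct CollectionLock {
    bool held = false;
    void acquire() { held = true; }
    void release() { assert(held); held = false; }
};

// Hypothetical stand-in for a collection-managed executor. dispose() records
// whether the lock was held at disposal time, which is the condition the
// debug-build check in the description is meant to enforce.
struct Executor {
    const CollectionLock& lock;
    bool& disposedUnderLock;
    bool disposed = false;
    Executor(const CollectionLock& l, bool& out) : lock(l), disposedUnderLock(out) {}
    void dispose() {
        if (!disposed) {
            disposed = true;
            disposedUnderLock = lock.held;
        }
    }
    ~Executor() { dispose(); }
};

// Buggy pattern: the lock is dropped around the reduce step, so an exception
// from reduceStep destroys the executor while the lock is not held.
bool runUnsafe(const std::function<void()>& reduceStep) {
    bool underLock = false;
    CollectionLock lock;
    try {
        lock.acquire();
        Executor exec(lock, underLock);
        lock.release();   // yield the lock for the reduce step
        reduceStep();     // may throw, e.g. on interruption at shutdown
        lock.acquire();
        exec.dispose();   // normal path: disposed under the lock
    } catch (const std::exception&) {
        // By the time we get here, ~Executor has already run without the lock.
    }
    if (lock.held) lock.release();
    return underLock;
}

// One possible remedy: on failure, reacquire the lock and dispose explicitly
// before letting the exception propagate.
bool runSafe(const std::function<void()>& reduceStep) {
    bool underLock = false;
    CollectionLock lock;
    try {
        lock.acquire();
        Executor exec(lock, underLock);
        lock.release();
        try {
            reduceStep();
        } catch (...) {
            lock.acquire();
            exec.dispose();  // disposal now happens with the lock held
            throw;
        }
        lock.acquire();
        exec.dispose();
    } catch (const std::exception&) {
    }
    if (lock.held) lock.release();
    return underLock;
}
```

Each function returns whether the executor was disposed while the lock was held. Passing a reduce step that throws, such as `[] { throw std::runtime_error("interrupted"); }`, makes runUnsafe() return false and runSafe() return true; with a reduce step that does not throw, both return true.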



 Comments   
Comment by Githook User [ 07/Nov/17 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-31844 Ensure that mapReduce holds the necessary locks for PlanExecutor disposal.
Branch: master
https://github.com/mongodb/mongo/commit/5f3151dc48951aa552284559a55b75b5ee55fd4e

Comment by David Storch [ 06/Nov/17 ]

I can reproduce this by applying the following patch:

diff --git a/src/mongo/db/commands/mr.cpp b/src/mongo/db/commands/mr.cpp
index 7f39b45a3c..76b372fc41 100644
--- a/src/mongo/db/commands/mr.cpp
+++ b/src/mongo/db/commands/mr.cpp
@@ -74,6 +74,7 @@
 #include "mongo/s/stale_exception.h"
 #include "mongo/scripting/engine.h"
 #include "mongo/stdx/mutex.h"
+#include "mongo/util/fail_point.h"
 #include "mongo/util/log.h"
 #include "mongo/util/mongoutils/str.h"
 #include "mongo/util/scopeguard.h"
@@ -95,6 +96,8 @@ namespace mr {
 
 AtomicUInt32 Config::JOB_NUMBER;
 
+MONGO_FP_DECLARE(throwDuringMRReduce);
+
 JSFunction::JSFunction(const std::string& type, const BSONElement& e) {
     _type = type;
     _code = e._asCode();
@@ -1555,6 +1558,11 @@ public:
 
                         scopedAutoDb.reset();
 
+                        if (MONGO_FAIL_POINT(throwDuringMRReduce)) {
+                            uasserted(ErrorCodes::OperationFailed,
+                                      "mapReduce failed due to 'throwDuringMRReduce' fail point");
+                        }
+
                         state.reduceAndSpillInMemoryStateIfNeeded();
 
                         scopedAutoDb.reset(new AutoGetDb(opCtx, config.nss.db(), MODE_S));

Configuring the failpoint and then running almost any mapReduce should trigger the invariant() in a debug build. Provided that test commands are enabled, the failpoint can be turned on as follows:

db.adminCommand({configureFailPoint: "throwDuringMRReduce", mode: "alwaysOn"});

Generated at Thu Feb 08 04:28:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.