[SERVER-38222] OP_KILL_CURSORS may not actually kill all (or any) cursors if one is pinned Created: 21/Nov/18  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: 3.2.21, 3.4.18, 3.6.9, 4.0.4, 4.1.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Ian Boros Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Execution
Operating System: ALL
Participants:

 Description   

If you run an OP_KILL_CURSORS with a list of cursor Ids [a, b, c, d...] and one of the cursors is pinned, the operation will fail early, and none of the cursors which come later in the list will be killed. For example, if cursor b is pinned, then cursors c and d would not be killed (or even examined).

This is because CursorManager::pinCursor may uassert or return a Status. We call killCursorGlobalIfAuthorized in a loop here attempts to kill each cursor one by one. If one of the cursors is pinned, the uassert triggers, and the entire operation fails, meaning none of the subsequent cursors get examined.

I do not believe this is possible to reproduce with the shell, since it seems like we only ever create OP_KILL_CURSORS messages with one cursor ID. However, according to the drivers spec, some drivers will kill cursors in batches:

Secondly, some drivers have a background cursor reaper to kill cursors that aren't exhausted and closed. Due to GC semantics ...

See the driver specs here.

It's worth mentioning that drivers probably only kill cursors once the user has stopped reading from them, meaning none of the cursors would be pinned. In that case, they wouldn't encounter this issue.

Here's a repro.
First apply this patch to your mongo/ repo (it adds a shell helper to use OP_KILL_CURSORS to kill multiple cursors at once). My base revision was b92c7480df5406cb9487228af36cb10bb973e77c.

diff --git a/src/mongo/db/dbmessage.cpp b/src/mongo/db/dbmessage.cpp
index 08915bc48e..4dfa7f754e 100644
--- a/src/mongo/db/dbmessage.cpp
+++ b/src/mongo/db/dbmessage.cpp
@@ -204,6 +204,17 @@ Message makeKillCursorsMessage(long long cursorId) {
     });
 }
 
+    Message makeKillCursorsMessage(std::vector<long long> cursorIds) {
+    return makeMessage(dbKillCursors, [&](BufBuilder& b) {
+        b.appendNum((int)0);  // reserved
+        b.appendNum((int)cursorIds.size());  // number of cursors.
+        for (auto&& id : cursorIds) {
+            b.appendNum(id);
+        }
+    });
+}
+
+
 Message makeGetMoreMessage(StringData ns, long long cursorId, int nToReturn, int flags) {
     return makeMessage(dbGetMore, [&](BufBuilder& b) {
         b.appendNum(flags);
diff --git a/src/mongo/db/dbmessage.h b/src/mongo/db/dbmessage.h
index d55ba0b994..4325a54e8e 100644
--- a/src/mongo/db/dbmessage.h
+++ b/src/mongo/db/dbmessage.h
@@ -435,6 +435,11 @@ Message makeRemoveMessage(StringData ns, BSONObj query, int flags = 0);
  */
 Message makeKillCursorsMessage(long long cursorId);
 
+/**
+ * Builds a legacy OP_KILLCURSORS message for multiple cursor ids.
+ */
+Message makeKillCursorsMessage(std::vector<long long> cursorIds);
+
 /**
  * Builds a legacy OP_GETMORE message.
  */
diff --git a/src/mongo/scripting/mozjs/mongo.cpp b/src/mongo/scripting/mozjs/mongo.cpp
index 92d60d4eee..56f0ce4c13 100644
--- a/src/mongo/scripting/mozjs/mongo.cpp
+++ b/src/mongo/scripting/mozjs/mongo.cpp
@@ -84,6 +84,7 @@ const JSFunctionSpec MongoBase::methods[] = {
     MONGO_ATTACH_JS_CONSTRAINED_METHOD_NO_PROTO(isReplicaSetMember, MongoExternalInfo),
     MONGO_ATTACH_JS_CONSTRAINED_METHOD_NO_PROTO(isMongos, MongoExternalInfo),
     MONGO_ATTACH_JS_CONSTRAINED_METHOD_NO_PROTO(_startSession, MongoExternalInfo),
+    MONGO_ATTACH_JS_CONSTRAINED_METHOD_NO_PROTO(killCursorsLegacy, MongoExternalInfo),
     JS_FS_END,
 };
 
@@ -762,6 +763,24 @@ void MongoBase::Functions::getMaxWireVersion::call(JSContext* cx, JS::CallArgs a
     args.rval().setInt32(conn->getMaxWireVersion());
 }
 
+void MongoBase::Functions::killCursorsLegacy::call(JSContext* cx, JS::CallArgs args) {
+    // Not exactly sure how to do this with an array. I'm not going to commit this though, so it doesn't matter.
+    auto scope = getScope(cx);
+    for (int i = 0; i < 3; i++) {
+        if (!scope->getProto<NumberLongInfo>().instanceOf(args.get(i)))
+            uasserted(ErrorCodes::BadValue, "all args must be a NumberLong");
+    }
+
+    long long cursorId0 = NumberLongInfo::ToNumberLong(cx, args.get(0));
+    long long cursorId1 = NumberLongInfo::ToNumberLong(cx, args.get(1));
+    long long cursorId2 = NumberLongInfo::ToNumberLong(cx, args.get(2));
+    
+    auto conn = getConnection(args);
+    auto toSend = makeKillCursorsMessage({cursorId0, cursorId1, cursorId2});
+    conn->say(toSend);
+    args.rval().setInt32(1);
+}
+
 void MongoBase::Functions::isReplicaSetMember::call(JSContext* cx, JS::CallArgs args) {
     auto conn = getConnection(args);
 
diff --git a/src/mongo/scripting/mozjs/mongo.h b/src/mongo/scripting/mozjs/mongo.h
index f99f50b633..7752de0a1f 100644
--- a/src/mongo/scripting/mozjs/mongo.h
+++ b/src/mongo/scripting/mozjs/mongo.h
@@ -68,9 +68,10 @@ struct MongoBase : public BaseInfo {
         MONGO_DECLARE_JS_FUNCTION(isReplicaSetMember);
         MONGO_DECLARE_JS_FUNCTION(isMongos);
         MONGO_DECLARE_JS_FUNCTION(_startSession);
+        MONGO_DECLARE_JS_FUNCTION(killCursorsLegacy);
     };
 
-    static const JSFunctionSpec methods[23];
+    static const JSFunctionSpec methods[24];
 
     static const char* const className;
     static const unsigned classFlags = JSCLASS_HAS_PRIVATE;

Then run this test (I called it "killcursors_bug.js"):

(function () {
    // This test runs manual getMores using different connections, which will not inherit the
    // implicit session of the cursor establishing command.
    TestData.disableImplicitSessions = true;
 
    const coll = db["legacy_kill_multiple_cursor"];
    const adminDB = db.getSiblingDB("admin");
 
    for (let i = 0; i < 10; i++) {
        assert.commandWorked(coll.insert({x: 1}));
    }
 
    function startCursor() {
        // Start an aggregation.
        const csCmdRes = assert.commandWorked(
            coll.runCommand({aggregate: coll.getName(), pipeline: [], cursor: {batchSize: 3}}));
 
        return csCmdRes.cursor.id;
    }
 
    const id1 = startCursor();
    const id2 = startCursor();
    const id3 = startCursor();
 
    // Run a getMore (and make it hang on a failpoint) so that cursor 2 is pinned.
    const kFailPointName = "waitAfterPinningCursorBeforeGetMoreBatch";
    assert.commandWorked(
        db.adminCommand({configureFailPoint: kFailPointName, mode: 'alwaysOn'}));
 
    let code = `const collName = "${coll.getName()}";`;
    code += `const cursorId = ${id2};`;
    function runGetMore() {
        const getMoreCmd = {getMore: cursorId, collection: collName, batchSize: 4};
        assert.commandWorked(db.runCommand(getMoreCmd));
    }
    code += `(${runGetMore.toString()})();`;
    const getMoreJoiner = startParallelShell(code);
 
    // Wait until we're hanging on the failpoint (TODO: Use CurOp for this).
    sleep(2 * 1000);
 
    // Make the shell send a legacy OP_KILL_CURSORS message with three cursor ids. Cursors
    // 1 and 3 are unpinned, and can be killed. Cursor 2 is pinned, and cannot be killed.
    db.getMongo().killCursorsLegacy(id1, id2, id3);
 
    // Check that id1 is no longer present as an idleCursor, but id3 is.
    const idleCursorsAndOps =
          adminDB
          .aggregate([
              {"$currentOp": {"localOps": true, "idleCursors": true, "allUsers": false}},
          ])
          .toArray();
 
    printjson(idleCursorsAndOps);
 
    for (let res of idleCursorsAndOps) {
        printjson(res);
        if (res.type === "op") {
            if (res.command.hasOwnProperty("aggregate")) {
                // This entry refers to the aggregation with the $currentOp stage we just ran.
                continue;
            }
            else {
                // It's the active getMore.
                assert.eq(res.command.getMore, id2, res);
            }
        } else {
            assert.eq(res.type, "idleCursor");
            // The fact that this assert below doesn't trigger is a bug. Cursor 3 should have been killed.
            assert.eq(res.cursor.cursorId, id3);
        }
    }
 
    assert.commandWorked(db.adminCommand({configureFailPoint: kFailPointName, mode: 'off'}));
    getMoreJoiner();
 
})();

resmoke.py --suites=no_passthrough_with_mongod killcursors_bug.js

CC david.storch since he first imagined this being a problem



 Comments   
Comment by Matt Broadstone [ 06/Dec/18 ]

behackett it does not

Comment by Bernie Hackett [ 04/Dec/18 ]

matt.broadstone, does Ruby do that against MongoDB 3.2+ (where the find and killCursors commands exist)? If so, why?

Comment by Bernie Hackett [ 04/Dec/18 ]

PyMongo may in some cases call OP_KILL_CURSORS with a list of cursors on MongoDB <= 3.0 and the killCursors command on MongoDB 3.2+.

Comment by Matt Broadstone [ 03/Dec/18 ]

This is not a problem for node (it uses the killCursors command only), but ruby does use OP_KILL_CURSORS and attempts to kill 10k batches at a time.

Comment by David Storch [ 03/Dec/18 ]

jeff.yemin, confirmed, this only affects OP_KILL_CURSORS. It does not affect killCursors command.

Comment by Jeffrey Yemin [ 03/Dec/18 ]

I did some digging and looks like java, .net, and go drivers kill a cursor at a time.

Comment by Jeffrey Yemin [ 02/Dec/18 ]

ian.boros can you confirm that this affects only OP_KILL_CURSORS and not the killCursors command?

Comment by Bernie Hackett [ 01/Dec/18 ]

I don't know if this is common or not jeff.yemin matt.broadstone jmikola, can you check with your teams?

Comment by Ian Boros [ 29/Nov/18 ]

No, sorry for being vague, I was just saying that I can confirm this affects all of the recently released versions of the server after a cursory inspection. I took a closer look at 3.2 and 3.4, and it looks like this bug also affects them, since the creation of a ClientCursorPin here can throw. See this and this.

Comment by Charlie Swanson [ 29/Nov/18 ]

ian.boros does that also mean that as far as you can tell it does not impact 3.4 or earlier?

Comment by Ian Boros [ 29/Nov/18 ]

As far as I can tell, this goes back to 3.6.

Comment by Craig Homa [ 29/Nov/18 ]

Hey behackett, do you think this could potentially be a big issue for any drivers? 

Is it common driver behavior to send an OP_KILL_CURSORS with multiple cursor id's?

Also, ian.boros please check on which server versions are affected by this.

Generated at Thu Feb 08 04:48:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.