[SERVER-43202] Aggregation system can continue trying to execute a query plan after being interrupted, leading to server-fatal invariant failure Created: 06/Sep/19  Updated: 29/Oct/23  Resolved: 02/Oct/19

Status: Closed
Project: Core Server
Component/s: Aggregation Framework, Querying
Affects Version/s: 4.2.0
Fix Version/s: 4.2.1, 4.3.1

Type: Bug Priority: Major - P3
Reporter: David Storch Assignee: Ian Boros
Resolution: Fixed Votes: 0
Labels: CTSA, KP42
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-55247 Invariant failure opCtx->lockState()... Closed
Related
related to SERVER-24198 When planning fails because there is ... Closed
is related to SERVER-55247 Invariant failure opCtx->lockState()... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.2
Steps To Reproduce:

I was able to reproduce this failure by first instrumenting the code with the following patch. The patch adds a bit of logging, but most importantly causes every query yield point to result in a ClientDisconnect error:

diff --git a/src/mongo/db/pipeline/pipeline_d.cpp b/src/mongo/db/pipeline/pipeline_d.cpp
index 12c95222fe..0b77a6be1d 100644
--- a/src/mongo/db/pipeline/pipeline_d.cpp
+++ b/src/mongo/db/pipeline/pipeline_d.cpp
@@ -695,6 +695,7 @@ StatusWith<std::unique_ptr<PlanExecutor, PlanExecutor::Deleter>> PipelineD::prep
     }
 
     if (sortStage) {
+        std::cout << "[dstorch] attempting to get a non-blocking sort plan" << std::endl;
         // See if the query system can provide a non-blocking sort.
         auto swExecutorSort =
             attemptToGetExecutor(opCtx,
@@ -708,6 +709,10 @@ StatusWith<std::unique_ptr<PlanExecutor, PlanExecutor::Deleter>> PipelineD::prep
                                  aggRequest,
                                  plannerOpts,
                                  matcherFeatures);
+        if (!swExecutorSort.isOK()) {
+            std::cout << "[dstorch] failed to get a non-blocking sort plan with code: "
+                      << swExecutorSort.getStatus() << std::endl;
+        }
 
         if (swExecutorSort.isOK()) {
             // Success! Now see if the query system can also cover the projection.
diff --git a/src/mongo/db/query/plan_yield_policy.cpp b/src/mongo/db/query/plan_yield_policy.cpp
index 43590f27d7..f8a945958b 100644
--- a/src/mongo/db/query/plan_yield_policy.cpp
+++ b/src/mongo/db/query/plan_yield_policy.cpp
@@ -84,6 +84,7 @@ void PlanYieldPolicy::resetTimer() {
 Status PlanYieldPolicy::yieldOrInterrupt(std::function<void()> whileYieldingFn) {
     invariant(_planYielding);
 
+
     if (_policy == PlanExecutor::INTERRUPT_ONLY) {
         ON_BLOCK_EXIT([this]() { resetTimer(); });
         OperationContext* opCtx = _planYielding->getOpCtx();
@@ -168,6 +169,8 @@ void PlanYieldPolicy::_yieldAllLocks(OperationContext* opCtx,
     // Attempt to check for interrupt while locks are not held, in order to discourage the
     // assumption that locks will always be held when a Plan Executor returns an error.
     if (_policy == PlanExecutor::YIELD_AUTO) {
+        std::cout << "[dstorch] marking operation killed on client disconnect" << std::endl;
+        opCtx->markKilled(ErrorCodes::ClientDisconnect);
         opCtx->checkForInterrupt();  // throws
     }

Next, I ran the following script against a build of the instrumented server. This configures the server to yield frequently, and then runs an aggregate command which exercises the flawed code path.

(function() {
"use strict";
 
db.c.drop();
assert.commandWorked(db.c.insert({a: 1, b: 1, c: 1}));
assert.commandWorked(db.c.insert({a: 1, b: 1, c: 1}));
 
assert.commandWorked(db.c.createIndex({a: 1, b: 1}));
assert.commandWorked(db.c.createIndex({a: 1, c: 1}));
 
// This is needed to make sure that a yield point is reached.
assert.commandWorked(db.adminCommand({setParameter: 1, internalQueryExecYieldIterations: 1}));
 
// This command triggers the crash.
db.c.aggregate([{$match: {a: 1, b: 1, c: 1}}, {$sort: {a: 1}}]).itcount();
}());

Sprint: Query 2019-09-09, Query 2019-09-23, Query 2019-10-07
Linked BF Score: 11

 Description   

The aggregation subsystem may invoke the underlying multi-planning system multiple times in order to select its execution plan. For example, when the pipeline contains a $sort, it first attempts to construct a plan with a non-blocking sort. If no such plan exists, it then invokes the plan selection code again without pushing the $sort down to the underlying PlanStage layer. See the code implementing this logic in src/mongo/db/pipeline/pipeline_d.cpp.
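
For illustration only, here is a minimal standalone C++ sketch of the two-pass plan selection pattern described above. The types and the attemptToGetExecutor stub are simplified stand-ins, not the actual PipelineD code:

#include <iostream>
#include <string>

// Simplified stand-ins for the server's StatusWith and error codes.
enum class ErrorCode { OK, NoQueryExecutionPlans };

struct ExecutorResult {
    ErrorCode code;
    std::string plan;  // empty when code != OK
    bool isOK() const { return code == ErrorCode::OK; }
};

// Hypothetical stand-in for attemptToGetExecutor(): pretend that no index can
// provide the requested sort order, so the sort-pushdown attempt fails.
ExecutorResult attemptToGetExecutor(bool pushDownSort) {
    if (pushDownSort) {
        return {ErrorCode::NoQueryExecutionPlans, ""};
    }
    return {ErrorCode::OK, "collection scan + blocking $sort in the pipeline"};
}

int main() {
    // First pass: try to push the $sort down so the PlanStage layer can sort
    // via an index (a non-blocking sort).
    ExecutorResult withSort = attemptToGetExecutor(/*pushDownSort=*/true);
    if (withSort.isOK()) {
        std::cout << "selected: " << withSort.plan << "\n";
        return 0;
    }
    // Second pass: re-run plan selection without the pushed-down sort; the
    // pipeline will perform the sort itself.
    ExecutorResult withoutSort = attemptToGetExecutor(/*pushDownSort=*/false);
    std::cout << "selected: " << withoutSort.plan << "\n";
    return 0;
}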

If the operation is killed during this process (because the client disconnected, because of an explicit killOp command, or because of a conflicting catalog event such as a collection drop), plan selection should terminate at the next yield point and the operation should stop running. However, incorrect logic accidentally swallows the interruption error and attempts to keep executing the aggregation operation. Specifically, the code checks for and propagates the error code QueryPlanKilled, but misses other possible error codes such as ClientDisconnect.
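
As a rough sketch of the flaw (hypothetical function names, not the server's actual code): the error filter is keyed on one specific code, so an interruption error such as ClientDisconnect falls through and execution continues, whereas the fix propagates any interruption error:

#include <iostream>

enum class ErrorCode { OK, QueryPlanKilled, ClientDisconnect, Interrupted };

// Buggy shape: only QueryPlanKilled is treated as "stop and propagate"; any
// other error from a failed plan-selection attempt (including an interruption
// such as ClientDisconnect) falls through, and the caller keeps going.
bool shouldPropagateBuggy(ErrorCode code) {
    return code == ErrorCode::QueryPlanKilled;
}

// Fixed shape: any interruption-style error is also propagated, so the
// operation stops at the yield point instead of continuing to execute.
bool isInterruption(ErrorCode code) {
    return code == ErrorCode::ClientDisconnect || code == ErrorCode::Interrupted;
}

bool shouldPropagateFixed(ErrorCode code) {
    return code == ErrorCode::QueryPlanKilled || isInterruption(code);
}

int main() {
    ErrorCode fromYield = ErrorCode::ClientDisconnect;  // what the repro patch injects
    std::cout << "buggy propagates: " << shouldPropagateBuggy(fromYield) << "\n";  // prints 0
    std::cout << "fixed propagates: " << shouldPropagateFixed(fromYield) << "\n";  // prints 1
    return 0;
}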

Since the interrupt check happens at query yield points, the interruption error is raised after the locks have been relinquished. The consequence is that we attempt to keep executing the aggregation operation without holding the proper locks. This promptly triggers the following server-fatal invariant check, which asserts that the necessary locks are held:

https://github.com/mongodb/mongo/blob/cdf0a059437f1ed417a22b3c574e2f73afc04233/src/mongo/db/catalog/database_holder_impl.cpp#L73-L74

When this scenario occurs, the server shuts down and the logs contain the following message, along with a backtrace:

2019-09-06T13:37:05.575-0400 F  -        [conn1] Invariant failure opCtx->lockState()->isDbLockedForMode(db, MODE_IS) || (db.compare("local") == 0 && opCtx->lockState()->isLocked()) src/mongo/db/catalog/database_holder_impl.cpp 74



 Comments   
Comment by Githook User [ 04/Oct/19 ]

Author:

{'name': 'Ian Boros', 'username': 'puppyofkosh', 'email': 'ian.boros@mongodb.com'}

Message: SERVER-43202 propagate errors from interrupts during yielding
Branch: v4.2
https://github.com/mongodb/mongo/commit/1f0ae1c45604798888f8b2425475a473d7265bd5

Comment by Githook User [ 01/Oct/19 ]

Author:

{'username': 'puppyofkosh', 'email': 'ian.boros@mongodb.com', 'name': 'Ian Boros'}

Message: SERVER-43202 propagate errors from interrupts during yielding
Branch: master
https://github.com/mongodb/mongo/commit/e77b9df00d5a269c8eb4955efef2c21c42819306
