- Type: Bug
- Resolution: Done
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Aggregation Framework
- None
- Query
- ALL
This can only happen when the aggregate is run with a batchSize of 0. The error has to occur in a getMore, and because $out returns no results, that is the only way the pipeline ends up executing in a getMore. The following series of events can then occur, eventually triggering a dassert, or potentially something worse (I don't fully understand the implications of having the flush lock locked in the wrong mode).
- An aggregate starts running inside a getMore. The cursor associated with the aggregation is pinned.
- An error occurs, throwing a UserException (in my case, an inserted document failed document validation).
- The ScopeGuard in the getMore command is destructed, which triggers cleanup of the cursor.
- GetMoreCmd::cleanupCursor() is called, taking the global, db, and collection lock in IS mode. I believe this is required to access the CursorManager.
- ClientCursorPin::deleteUnderlying() is called, triggering kills and destructions eventually leading to the $out stage's destruction.
- The $out stage's destructor is triggered, and attempts to clean up the temporary collection it created and partially filled.
- The drop command attempts to take the DB lock in X mode. This is an upgrade from the IS lock already held, which eventually triggers a dassert().
I think this can also happen on the 3.2 branch and earlier, but I haven't verified.
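The sequence above can be sketched as a toy model. This is not server code: `ToyLocker`, `ToyOutStage`, `cleanup_cursor`, `run_get_more`, and `LockUpgradeError` are all hypothetical names invented for illustration, the `ValueError` stands in for the UserException, and the `LockUpgradeError` stands in for the dassert. The sketch only assumes the usual lock-mode coverage rules (X covers everything, IS covers only IS) and uses an `ExitStack` callback to play the role of the ScopeGuard.

```python
from contextlib import ExitStack

# Which requested modes an already-held mode covers (toy multi-granularity table).
COVERS = {
    "IS": {"IS"},
    "IX": {"IX", "IS"},
    "S": {"S", "IS"},
    "X": {"X", "S", "IX", "IS"},
}


class LockUpgradeError(Exception):
    """Stands in for the dassert firing in the real server."""


class ToyLocker:
    """Tracks the mode held on each resource for a single operation."""

    def __init__(self):
        self.held = {}  # resource name -> mode

    def lock(self, resource, mode):
        current = self.held.get(resource)
        if current is not None and mode not in COVERS[current]:
            # Requesting a stronger mode than the one held is an upgrade.
            raise LockUpgradeError(f"{resource}: cannot upgrade {current} -> {mode}")
        if current is None:
            self.held[resource] = mode

    def unlock(self, resource):
        self.held.pop(resource, None)


class ToyOutStage:
    """Hypothetical stand-in for the $out stage and its temporary collection."""

    def __init__(self, locker, events):
        self.locker = locker
        self.events = events

    def write(self, doc):
        if not doc.get("valid", True):
            raise ValueError("document failed validation")

    def dispose(self):
        # Plays the destructor: dropping the temp collection wants the DB lock in X.
        self.events.append("drop temp collection")
        self.locker.lock("db", "X")


def cleanup_cursor(locker, stage, events):
    """Plays GetMoreCmd::cleanupCursor(): take IS, then destroy the cursor."""
    events.append("cleanupCursor takes IS")
    locker.lock("db", "IS")
    try:
        stage.dispose()
    finally:
        locker.unlock("db")


def run_get_more(docs):
    events = []
    locker = ToyLocker()
    stage = ToyOutStage(locker, events)
    with ExitStack() as guard:
        # The callback runs when the block exits, including during an exception.
        guard.callback(cleanup_cursor, locker, stage, events)
        for doc in docs:
            stage.write(doc)
        guard.pop_all()  # success: dismiss the guard without running it
    return events
```

Calling `run_get_more([{"valid": False}])` in this model raises `LockUpgradeError` while the original `ValueError` is still propagating, which mirrors how the cleanup path, not the validation failure itself, is what requests X while IS is held.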
- depends on: SERVER-22541 Aggregation plan executors should be owned by global cursor manager (Closed)