- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Integration
- ALL
When a MongoDB operation using a $where predicate is interrupted (e.g. by maxTimeMS expiration), the WASM JS engine returns ErrorCodes::Interrupted (11601) instead of the actual interrupt reason such as MaxTimeMSExpired (50).
Root Cause
When maxTimeMS expires, the following sequence occurs:
1. The OperationContext is marked killed with MaxTimeMSExpired (code 50)
2. WasmtimeScriptEngine::interrupt() calls scope->kill(), which increments the wasmtime epoch to signal the WASM engine
3. The WASM engine fires a wasm trap: interrupt
4. MozJSWasmBridge::_callFunc in src/mongo/scripting/mozjs/wasm/bridge/bridge.h catches the trap and, seeing isKillPending() == true, unconditionally throws ErrorCodes::Interrupted with the raw wasmtime backtrace message
At step 4, the bridge has no reference to the OperationContext and cannot consult the actual kill reason. It always emits Interrupted regardless of whether the kill was caused by maxTimeMS, an explicit killOp, or any other reason.
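The loss of the kill reason can be illustrated with a small standalone model. This is only a sketch under stated assumptions: ErrorCode, CodedError, BridgeModel, OpCtxModel, and observedCodeAfterMaxTimeMS are hypothetical stand-ins written for this ticket, not the server's actual classes, and only the numeric codes 50 and 11601 come from the description above.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the server's error codes. Only the numeric
// values (50 and 11601) are taken from the ticket.
enum class ErrorCode : int { MaxTimeMSExpired = 50, Interrupted = 11601 };

// Hypothetical exception type that carries an error code.
struct CodedError : std::runtime_error {
    ErrorCode code;
    CodedError(ErrorCode c, const std::string& msg)
        : std::runtime_error(msg), code(c) {}
};

// Simplified model of the operation context: it records why it was killed.
struct OpCtxModel {
    bool killed = false;
    ErrorCode killReason = ErrorCode::Interrupted;
    void markKilled(ErrorCode reason) {
        killed = true;
        killReason = reason;
    }
};

// Simplified model of the bridge: it knows only that a kill is pending and
// has no access to OpCtxModel, mirroring the situation described in step 4.
struct BridgeModel {
    bool killPending = false;
    void kill() { killPending = true; }  // models the epoch increment
    void callFunc() const {
        if (killPending) {
            // Models catching the trap and throwing Interrupted unconditionally.
            throw CodedError(ErrorCode::Interrupted, "wasm trap: interrupt");
        }
    }
};

// Walks through the four steps and returns the code the caller observes.
int observedCodeAfterMaxTimeMS() {
    OpCtxModel opCtx;
    BridgeModel bridge;
    opCtx.markKilled(ErrorCode::MaxTimeMSExpired);  // step 1
    bridge.kill();                                  // step 2
    try {
        bridge.callFunc();                          // steps 3 and 4
    } catch (const CodedError& e) {
        return static_cast<int>(e.code);
    }
    return 0;  // not reached in this model
}
```

In this model the caller sees 11601 even though the context recorded 50, because nothing on the throwing path reads killReason.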
Impact
Any code that distinguishes between interrupt reasons — such as tests or retry logic that checks for MaxTimeMSExpired vs. Interrupted — will behave incorrectly when $where is involved. The test jstests/noPassthrough/interruption/max_time_ms_does_not_leak_shard_cursor.js demonstrates this: it expects MaxTimeMSExpired (50) or NetworkInterfaceExceededTimeLimit (202) but receives Interrupted (11601).
Proposed Fix
WasmtimeImplScope (in src/mongo/scripting/mozjs/wasm/scope/scope.cpp) already holds a pointer to the OperationContext via _opCtx. In WasmtimeImplScope::invoke(), wrap each bridge call (invokePredicate, invokeFunction, invokeMap) in a try-catch block. On exception, if _bridge->isKillPending() and _opCtx is set, call _opCtx->checkForInterrupt(), which will re-throw with the correct error code (e.g. MaxTimeMSExpired). If the opCtx was not the source of the kill (e.g. a DeadlineMonitor-only timeout), checkForInterrupt() will not throw and the original exception propagates unchanged.
As a secondary fix, the same try-catch blocks also ensure _deadlineMonitor.stopDeadline() is called on exception paths, where it is currently skipped.
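The shape of the proposed change can be sketched with a standalone model. Everything below is an assumption made for illustration: ScopeModel, BridgeModel, OpCtxModel, DeadlineMonitorModel, CodedError, and invokeAndGetCode are hypothetical simplifications, and the real WasmtimeImplScope::invoke() may be structured differently. The sketch only shows the control flow: stop the deadline on every path, consult the operation context when a kill is pending, and otherwise rethrow the original exception.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; only the numeric codes come from the ticket.
enum class ErrorCode : int { MaxTimeMSExpired = 50, Interrupted = 11601 };

struct CodedError : std::runtime_error {
    ErrorCode code;
    CodedError(ErrorCode c, const std::string& msg)
        : std::runtime_error(msg), code(c) {}
};

// Models an operation context whose checkForInterrupt() throws the
// recorded kill reason, and does nothing if the context was not killed.
struct OpCtxModel {
    bool killed = false;
    ErrorCode killReason = ErrorCode::Interrupted;
    void checkForInterrupt() const {
        if (killed) throw CodedError(killReason, "operation was interrupted");
    }
};

// Models the bridge, which always reports Interrupted when a kill is pending.
struct BridgeModel {
    bool killPending = false;
    bool isKillPending() const { return killPending; }
    bool invokePredicate() const {
        if (killPending) throw CodedError(ErrorCode::Interrupted, "wasm trap: interrupt");
        return true;
    }
};

// Counts stopDeadline() calls so the exception path can be checked.
struct DeadlineMonitorModel {
    int stopCalls = 0;
    void stopDeadline() { ++stopCalls; }
};

struct ScopeModel {
    OpCtxModel* opCtx = nullptr;
    BridgeModel bridge;
    DeadlineMonitorModel deadlineMonitor;

    bool invoke() {
        try {
            bool result = bridge.invokePredicate();
            deadlineMonitor.stopDeadline();
            return result;
        } catch (...) {
            // Secondary fix: stop the deadline on the exception path too.
            deadlineMonitor.stopDeadline();
            if (bridge.isKillPending() && opCtx) {
                // Throws the real kill reason if the opCtx was killed.
                opCtx->checkForInterrupt();
            }
            throw;  // opCtx was not the source: propagate the original error
        }
    }
};

// Helper for callers of this sketch: returns the observed code, or 0 on success.
int invokeAndGetCode(ScopeModel& scope) {
    try {
        scope.invoke();
        return 0;
    } catch (const CodedError& e) {
        return static_cast<int>(e.code);
    }
}
```

With a killed context the model reports 50. With a pending kill but a live context it still reports 11601, matching the DeadlineMonitor-only case described above.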
Steps to Reproduce
Run jstests/noPassthrough/interruption/max_time_ms_does_not_leak_shard_cursor.js on a build with the WASM JS engine enabled. The first test block (the $where + maxTimeMS case) will fail with:
Error: [11601] and [[ 50, 202 ]] are not equal
Affected Components
- src/mongo/scripting/mozjs/wasm/bridge/bridge.h — _callFunc throws Interrupted unconditionally
- src/mongo/scripting/mozjs/wasm/scope/scope.cpp — invoke() does not re-check opCtx kill status after a WASM interrupt
- related to: SERVER-116052 Add support for $function (Closed)