- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Integration
- Fully Compatible
- ALL
Summary
The query-correctness-tests-4 reference files contain $regularExpression pattern values with a spurious trailing forward slash — e.g. "pattern":"methodologies|Advanced|extend|SMS/" — wherever the result was produced by a $function aggregation expression returning a regex literal. This trailing slash is not part of the regex pattern; it is an artifact of a long-standing bug in ValueWriter::toRegEx() in the MozJS scripting engine, which was the engine used to generate the reference files.
Root Cause
toRegEx() reconstructs a BSON regex from a SpiderMonkey regex object by calling toString() (which returns "/pattern/flags") and then splitting on the last /:
    return JSRegEx(regexStr.substr(1, regexStr.rfind('/')),  // BUG: count == position
                   regexStr.substr(regexStr.rfind('/') + 1));
std::string::substr(pos, count) takes a length, not an end position. Because rfind('/') returns the index of the closing delimiter, passing it as the count yields a substring one character too long — it includes the closing delimiter itself. For /SMS/ this produces pattern SMS/ instead of SMS.
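The off-by-one can be reproduced outside the server with plain std::string. The sketch below is illustrative only: splitBuggy, splitFixed, PatternAndFlags, and checkSplitExamples are hypothetical names, not the actual ValueWriter code, and the input is assumed to always have the "/pattern/flags" shape.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helpers for illustration; not the server's actual code.
// Both take the "/pattern/flags" text described above.
using PatternAndFlags = std::pair<std::string, std::string>;

// Mirrors the buggy split: the closing delimiter's index is passed as a count.
PatternAndFlags splitBuggy(const std::string& regexStr) {
    const auto close = regexStr.rfind('/');
    return {regexStr.substr(1, close), regexStr.substr(close + 1)};
}

// Corrected split: the body occupies indices [1, close), so its length is close - 1.
PatternAndFlags splitFixed(const std::string& regexStr) {
    const auto close = regexStr.rfind('/');
    return {regexStr.substr(1, close - 1), regexStr.substr(close + 1)};
}

// Small self-check covering the /SMS/ case and a regex that carries flags.
void checkSplitExamples() {
    assert(splitBuggy("/SMS/").first == "SMS/");
    assert(splitFixed("/SMS/").first == "SMS");
    assert(splitBuggy("/a|b/gi").first == "a|b/");
    assert(splitFixed("/a|b/gi") == PatternAndFlags("a|b", "gi"));
}
```

Calling checkSplitExamples() from a test or main() exercises both variants; the flags half is identical in each, so only the pattern differs.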
This is the exact defect tracked in SERVER-98936 ("Fix encoding of regex patterns returned from $function"), which was closed Won't Fix in January 2025, and is the underlying cause reported in SERVER-98945 ("In-query functions incorrectly process regexes nested in arrays"), where the same \/-escaping ambiguity surfaces in the input direction.
The reference files were generated against the MozJS engine while the bug was active. Every $regularExpression value that originated from a $function return went through toRegEx() and therefore has an extra trailing / in its "pattern" field. No collection data is affected — the .coll files contain no regex fields; all $regularExpression values in results come exclusively from $function computations.
Fix
- Fix ValueWriter::toRegEx() by changing the count argument of the first substr() call from rfind('/') to rfind('/') - 1 so only the pattern body is captured.
- Fix ValueWriter::_writeObject() to use JS::GetRegExpSource() and JS::GetRegExpFlags() directly, avoiding the toString()/rfind round-trip entirely, which eliminates ambiguity when the pattern itself contains \/.
- Update the 154 affected .results files in query-correctness-tests-4 to remove the trailing / from each "pattern" field.
- Bump query-correctness-tests-4 commit hash in test_repos.conf.
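The reference-file update amounts to a mechanical text rewrite. The following is only a rough sketch of what such a rewrite might look like, not the tooling actually used for the 154 files: stripTrailingSlash is a hypothetical helper, and it assumes each value appears as "pattern":"..." with the trailing slash written unescaped, as in the example in the Summary. It would also alter a pattern that legitimately ends in /, so any real run would need to be limited to values known to come from $function.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: removes one trailing '/' from every "pattern" string
// value found in a line of results text. Values that do not end in an
// unescaped '/' are left untouched.
std::string stripTrailingSlash(const std::string& line) {
    // Group 1 captures the key, the opening quote, and the string body
    // (ordinary characters or backslash escapes) up to the final '/'.
    static const std::regex kPattern(
        R"re(("pattern"\s*:\s*"(?:[^"\\]|\\.)*)/")re");
    // Re-emit the captured prefix and close the string without the slash.
    return std::regex_replace(line, kPattern, "$1\"");
}
```

Applied to the Summary's example, the value "methodologies|Advanced|extend|SMS/" becomes "methodologies|Advanced|extend|SMS", while neighbouring fields are unchanged.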
Related tickets
- SERVER-98936: original bug report for the trailing-slash artifact in $function regex returns (closed Won't Fix)
- SERVER-98945: adjacent manifestation of the same \/ escaping ambiguity for regexes in arrays passed to $function
- SERVER-116052: WASM JS engine work, where the cross-engine result discrepancy exposed the bug
- is fixed by
  - SERVER-116052 Add support for $function (Closed)
- is related to
  - SERVER-127016 Extend regex trailing-slash fix to query-correctness-tests-1 (Open)
  - SERVER-98945 In-query functions incorrectly process regexes nested in arrays (Backlog)
  - SERVER-98936 Fix encoding of regex patterns returned from $function (Closed)
  - SERVER-127015 Update jstestfuzz to use the new MozJS regex handling (Investigating)
- related to
  - SERVER-98945 In-query functions incorrectly process regexes nested in arrays (Backlog)
  - SERVER-98936 Fix encoding of regex patterns returned from $function (Closed)
  - SERVER-116052 Add support for $function (Closed)