Type: Bug
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
Query Optimization
ALL
PathArrayness incorrectly reports that a field cannot be an array when the only index information about that field comes from a positional path (e.g., e.0). This causes the dependency graph's canPathBeArray() analysis to return false for paths that actually can (and do) contain arrays, leading to failures in the aggregation_dependency_graph_validation_passthrough suite.
Root Cause
When a BTree index has a key path like "e.0" and the document has e: ["elem1"], the BtreeKeyGenerator uses positional array access to extract e[0] without generating multiple keys. This means the index is not marked as multikey at the e level (multikeyPaths for that key path is empty).
PathArrayness::TrieNode::insertPath() then creates a trie node for "e" with canBeArray = false because multikeyPaths.count(0) == 0.
When the dependency graph later calls canPathBeArray("e"), the trie lookup finds the "e" node and returns false. But the field e is an array — the non-multikey status only means the index doesn't expand the array into multiple keys due to positional access, not that the field itself isn't an array.
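The mechanism above can be sketched with a small stand-alone model. This is not the server code: `SketchNode`, `splitPath`, `insertPathSketch`, and `canPathBeArraySketch` are hypothetical names, the set of multikey depths stands in for the real multikeyPaths structure, and the rule for merging evidence from several indexes is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the PathArrayness trie.
struct SketchNode {
    bool canBeArray = true;  // unknown paths are treated as "may be an array"
    std::map<std::string, std::unique_ptr<SketchNode>> children;
};

// Splits a dotted path such as "e.0" into {"e", "0"}.
std::vector<std::string> splitPath(const std::string& path) {
    std::vector<std::string> parts;
    std::stringstream ss(path);
    std::string part;
    while (std::getline(ss, part, '.')) {
        parts.push_back(part);
    }
    return parts;
}

// Models the behavior described above: a component is recorded as non-array
// whenever the index is not multikey at that depth, even if the next
// component is positional. multikeyDepths lists the depths where the index
// is multikey for this key path.
void insertPathSketch(SketchNode& root,
                      const std::string& path,
                      const std::set<std::size_t>& multikeyDepths) {
    SketchNode* node = &root;
    const std::vector<std::string> parts = splitPath(path);
    for (std::size_t depth = 0; depth < parts.size(); ++depth) {
        std::unique_ptr<SketchNode>& child = node->children[parts[depth]];
        if (!child) {
            child = std::make_unique<SketchNode>();
        }
        const bool multikeyHere = multikeyDepths.count(depth) > 0;
        // Assumed merge rule: any non-multikey index lowers the flag to false.
        child->canBeArray = child->canBeArray && multikeyHere;
        node = child.get();
    }
}

// Simplified lookup: returns the flag on the node for the full path, or
// true (conservative) when the path was never inserted.
bool canPathBeArraySketch(const SketchNode& root, const std::string& path) {
    const SketchNode* node = &root;
    for (const std::string& part : splitPath(path)) {
        auto it = node->children.find(part);
        if (it == node->children.end()) {
            return true;
        }
        node = it->second.get();
    }
    return node->canBeArray;
}
```

Inserting "e.0" with an empty set of multikey depths and then querying "e" returns false in this model, even though the document stores e: ["elem1"]. A path that was never inserted still returns the conservative true.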
Impact
The $_internalValidateArrayness stage detects this mismatch at runtime and throws error 12508302:
Dependency graph arrayness validation failed: field 'e' contains an array but canPathBeArray() returned false.
This causes failures in:
- jstests/aggregation/sources/project/remove_redundant_projects.js — the $project with $filter reads field e (which is ["elem1"]), with index {a: 1, "c.d": 1, "e.0": 1}
- jstests/aggregation/sources/lookup/lookup_equijoin_semantics_inlj.js — similar interaction with positional indexed paths
Reproduction
buildscripts/resmoke.py run --force-excluded-tests \
    --suites=aggregation_dependency_graph_validation_passthrough \
    jstests/aggregation/sources/project/remove_redundant_projects.js
Unit tests demonstrating the bug:
bazel run +path_arrayness_test -- --gtest_filter="*Positional*"
Fix
PathArrayness should not conclude that a path prefix cannot be an array solely because an index with a positional (numeric) path component is not multikey at that depth. When a path like "e.0" has no multikey component at depth 0, the trie should either:
- Not insert the parent node "e" at all (leaving it unknown → conservative true), or
- Detect that the next component is numeric (a positional accessor) and mark the parent as conservatively possibly-array
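One possible shape for the second option is sketched below. It is a self-contained illustration under stated assumptions, not the actual patch: `PosAwareNode`, `isPositionalComponent`, `insertPathPositionAware`, and `lookupCanBeArray` are hypothetical names, and the set of multikey depths again stands in for multikeyPaths. The idea is to treat "not multikey" as evidence of non-arrayness only when the next path component is not numeric.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <memory>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified trie node; true means "may be an array".
struct PosAwareNode {
    bool canBeArray = true;
    std::map<std::string, std::unique_ptr<PosAwareNode>> children;
};

// Splits a dotted path such as "e.0" into {"e", "0"}.
std::vector<std::string> splitDotted(const std::string& path) {
    std::vector<std::string> parts;
    std::stringstream ss(path);
    std::string part;
    while (std::getline(ss, part, '.')) {
        parts.push_back(part);
    }
    return parts;
}

// A component made only of digits may be a positional array accessor.
bool isPositionalComponent(const std::string& component) {
    return !component.empty() &&
        std::all_of(component.begin(), component.end(),
                    [](unsigned char ch) { return std::isdigit(ch) != 0; });
}

// Records non-arrayness only when the evidence is trustworthy: if the next
// component is positional, the parent's flag is left untouched.
void insertPathPositionAware(PosAwareNode& root,
                             const std::string& path,
                             const std::set<std::size_t>& multikeyDepths) {
    PosAwareNode* node = &root;
    const std::vector<std::string> parts = splitDotted(path);
    for (std::size_t depth = 0; depth < parts.size(); ++depth) {
        std::unique_ptr<PosAwareNode>& child = node->children[parts[depth]];
        if (!child) {
            child = std::make_unique<PosAwareNode>();
        }
        const bool nextIsPositional =
            depth + 1 < parts.size() && isPositionalComponent(parts[depth + 1]);
        if (!nextIsPositional) {
            // Assumed merge rule: a non-multikey index lowers the flag.
            const bool multikeyHere = multikeyDepths.count(depth) > 0;
            child->canBeArray = child->canBeArray && multikeyHere;
        }
        node = child.get();
    }
}

// Returns the flag for the full path, or true when the path is unknown.
bool lookupCanBeArray(const PosAwareNode& root, const std::string& path) {
    const PosAwareNode* node = &root;
    for (const std::string& part : splitDotted(path)) {
        auto it = node->children.find(part);
        if (it == node->children.end()) {
            return true;
        }
        node = it->second.get();
    }
    return node->canBeArray;
}
```

With the key paths of the index {a: 1, "c.d": 1, "e.0": 1} inserted as non-multikey, this model keeps "e" at the conservative true while "a" is still reported as non-array. A numeric component can also be an ordinary field name in a subdocument, so the check is deliberately over-conservative: it may give up some precision but never claims that an array field cannot be an array.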
Unit Tests
Added in src/mongo/db/query/compiler/metadata/path_arrayness_test.cpp:
- PathArraynessTest.PositionalIndexPathShouldNotMarkParentAsNonArray
- PathArraynessTest.CompoundIndexWithPositionalPathDoesNotAffectParentArrayness
is depended on by
- SERVER-127302 Dependency graph arrayness tracking incorrect for $lookup with sorted INLJ indexes (Needs Scheduling)
- SERVER-125083 Introduce a new jstest suite for testing the dependency graph (In Code Review)