-
Type:
Improvement
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Querying
-
Query Execution
-
None
-
None
-
None
-
None
-
None
-
None
-
None
At present, our MatchExpression sorting algorithm orders nodes based the following criteria:
1) operator type (MatchExpression::MatchType)
2) path name (MatchExpression::path())
3) sort order of children
4) number of children (MatchExpression::numChildren())
It does not, however, sort the leaf predicate values of branches which are otherwise identical. This means that it is not always possible to generate a properly normalized MatchExpression tree, since two functionally identical trees will not sort to the same output tree if their input predicate order differs. In turn, this means that MatchExpression::equivalent will not regard the trees as equivalent even if they are trivially identical.
For instance, take an input query like:
{$or: [{a: {$gt: 1}}, {a: {$lt: 1}}, {a: {$gt: 0}}, {a: {$lt: 0}}]}
When sorted, we will only ensure that all the $lt predicates sort before the $gt predicates, but we will not sort the actual values of each predicate type. So depending on the particular order in which the input predicates were specified, the "sorted" output might be any of:
{$or: [{a: {$lt: 0}}, {a: {$lt: 1}}, {a: {$gt: 0}}, {a: {$gt: 1}}]} {$or: [{a: {$lt: 1}}, {a: {$lt: 0}}, {a: {$gt: 0}}, {a: {$gt: 1}}]} {$or: [{a: {$lt: 1}}, {a: {$lt: 0}}, {a: {$gt: 1}}, {a: {$gt: 0}}]} ... etc
By adding the ability to sort leaf predicate values, we can guarantee that the above query will always sort to a single canonical form, regardless of input predicate order:
{$or: [{a: {$lt: 0}}, {a: {$lt: 1}}, {a: {$gt: 0}}, {a: {$gt: 1}}]}
- related to
-
SERVER-47664 Improved MatchExpression equivalence checking
-
- Backlog
-