[SERVER-47661] Ensure consistent sort-order for MatchExpression trees Created: 20/Apr/20  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Bernard Gorman Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-47664 Improved MatchExpression equivalence ... Backlog
Assigned Teams:
Query Execution
Participants:

 Description   

At present, our MatchExpression sorting algorithm orders nodes based the following criteria:

1) operator type (MatchExpression::MatchType)
2) path name (MatchExpression::path())
3) sort order of children
4) number of children (MatchExpression::numChildren())

It does not, however, sort the leaf predicate values of branches which are otherwise identical. This means that it is not always possible to generate a properly normalized MatchExpression tree, since two functionally identical trees will not sort to the same output tree if their input predicate order differs. In turn, this means that MatchExpression::equivalent will not regard the trees as equivalent even if they are trivially identical.

For instance, take an input query like:

{$or: [{a: {$gt: 1}}, {a: {$lt: 1}}, {a: {$gt: 0}}, {a: {$lt: 0}}]}

When sorted, we will only ensure that all the $lt predicates sort before the $gt predicates, but we will not sort the actual values of each predicate type. So depending on the particular order in which the input predicates were specified, the "sorted" output might be any of:

{$or: [{a: {$lt: 0}}, {a: {$lt: 1}}, {a: {$gt: 0}}, {a: {$gt: 1}}]}
{$or: [{a: {$lt: 1}}, {a: {$lt: 0}}, {a: {$gt: 0}}, {a: {$gt: 1}}]}
{$or: [{a: {$lt: 1}}, {a: {$lt: 0}}, {a: {$gt: 1}}, {a: {$gt: 0}}]}
...
etc

By adding the ability to sort leaf predicate values, we can guarantee that the above query will always sort to a single canonical form, regardless of input predicate order:

{$or: [{a: {$lt: 0}}, {a: {$lt: 1}}, {a: {$gt: 0}}, {a: {$gt: 1}}]}


Generated at Thu Feb 08 05:14:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.