As part of MatchExpression::optimize(), we have logic that tries to rewrite an $or of equalities over the same path into an $in. This is advantageous because it helps the downstream optimization code produce better plans. Here's an example of the rewrite:
// Original match expression.
{$or: [{name: "Don"}, {name: "Alice"}]}

// This gets rewritten to the following.
{name: {$in: ["Alice", "Don"]}}
When only some of the clauses of the $or can be rewritten in this manner, the current implementation can output a match expression tree with an $or that is a direct child of another $or. Here's an example:
// Original match expression.
{$or: [{name: "Don"}, {name: "Alice"}, {age: 42}, {job: "Software Engineer"}]}

// This gets rewritten to the following.
{$or: [{name: {$in: ["Alice", "Don"]}}, {$or: [{age: 42}, {job: "Software Engineer"}]}]}
As you can see, one of the direct children of the outer $or is another $or. A separate rewrite, also part of MatchExpression::optimize(), attempts to flatten such nested $or nodes. However, the $or -> $in rewrite happens afterwards, so the nesting it introduces is not flattened by that step. In the master branch, the nested $or is subsequently simplified by the new boolean simplification module enabled in SERVER-81630, but in older branches it is never simplified.
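To make the flattening step concrete, here is a minimal Python sketch that models match expressions as plain dicts. It is only an illustration: `flatten_or` is a hypothetical helper, not the server's C++ code, and it only handles an expression whose sole key is `$or`.

```python
def flatten_or(expr):
    """Absorb any direct child that is itself an $or into its parent $or.

    Hypothetical model: a match expression is a dict, and an $or node is a
    dict whose only key is "$or".
    """
    if not (isinstance(expr, dict) and list(expr) == ["$or"]):
        return expr  # Not an $or node; leave it untouched.

    flat_children = []
    for child in expr["$or"]:
        child = flatten_or(child)  # Flatten deeper levels first.
        if isinstance(child, dict) and list(child) == ["$or"]:
            flat_children.extend(child["$or"])  # Hoist the grandchildren.
        else:
            flat_children.append(child)
    return {"$or": flat_children}


# The nested output from the example above.
nested = {
    "$or": [
        {"name": {"$in": ["Alice", "Don"]}},
        {"$or": [{"age": 42}, {"job": "Software Engineer"}]},
    ]
}
flattened = flatten_or(nested)
```

If a step like this ran after the $or -> $in rewrite rather than before it, the nested $or would collapse into its parent.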
I would argue that, even with boolean simplification in place, we should modify the implementation of the $or -> $in rewrite to avoid constructing directly nested $or nodes. Continuing the example above, the output of the $or rewrite should be as follows:
{$or: [{name: {$in: ["Alice", "Don"]}}, {age: 42}, {job: "Software Engineer"}]}
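
One way to get this shape is to append the clauses that cannot be rewritten directly to the same $or that receives the $in, instead of wrapping them in a new $or. The Python sketch below models that idea on plain dicts. It is a simplified illustration, not the server's C++ implementation: `is_simple_equality` and `rewrite_or_to_in` are hypothetical names, only scalar equalities are treated as rewritable, and the values for one path are assumed to be mutually comparable so they can be sorted.

```python
def is_simple_equality(clause):
    """True for a single-path scalar equality such as {"name": "Don"}.

    Assumption for this sketch: equalities against dicts or lists are
    treated as not rewritable.
    """
    if not isinstance(clause, dict) or len(clause) != 1:
        return False
    (path, value), = clause.items()
    return not path.startswith("$") and not isinstance(value, (dict, list))


def rewrite_or_to_in(expr):
    """Rewrite equalities over the same path to $in without nesting $or."""
    if not (isinstance(expr, dict) and list(expr) == ["$or"]):
        return expr

    values_by_path = {}  # path -> equality values, in first-seen path order
    other_clauses = []   # clauses that are not simple equalities
    for clause in expr["$or"]:
        if is_simple_equality(clause):
            (path, value), = clause.items()
            values_by_path.setdefault(path, []).append(value)
        else:
            other_clauses.append(clause)

    children = []
    for path, values in values_by_path.items():
        if len(values) > 1:
            # Sorted and de-duplicated to mirror the example output.
            children.append({path: {"$in": sorted(set(values))}})
        else:
            children.append({path: values[0]})  # Lone equality stays as-is.

    # Remaining clauses join the same $or; no inner $or is created.
    children.extend(other_clauses)
    return children[0] if len(children) == 1 else {"$or": children}


original = {
    "$or": [
        {"name": "Don"},
        {"name": "Alice"},
        {"age": 42},
        {"job": "Software Engineer"},
    ]
}
result = rewrite_or_to_in(original)
print(result)
```

When every clause collapses into a single $in, the sketch returns that $in on its own, which matches the first example in this ticket.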