[SERVER-79205] [CQF] $not $eq array, on a missing field, should return true Created: 21/Jul/23  Updated: 22/Sep/23

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Percy Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: bonsai-semantic-difference
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-79488 [CQF] Different behavior with empty-a... Open
is related to SERVER-80239 [CQF] Re-consider comparison against ... Backlog
is related to SERVER-81376 [CQF] Disable NotPushdown by default Closed
Assigned Teams:
Query Optimization
Operating System: ALL
Sprint: QO 2023-08-21
Participants:

 Description   

This is something the fuzzer discovered.

Example collection and queries on classic:

> db1.c.find()
{ "_id" : 5 }
 
> db1.c.find({array: {$eq: ['a']}})
// empty results
 
> db1.c.find({array: {$not: {$eq: ['a']}}})
{ "_id" : 5 }

Same example with forceBonsai:

> db2.c.find()
{ "_id" : 5 }
 
> db2.c.find({array: {$eq: ['a']}})
// empty results
 
> db2.c.find({array: {$not: {$eq: ['a']}}})
// empty results -- incorrect

This only happens when the constant is an array. For example this query is correct:

> db1.c.find({array: {$not: {$eq: 'a'}}})
{ "_id" : 5 }
> db2.c.find({array: {$not: {$eq: 'a'}}})
{ "_id" : 5 }

The Bonsai plan before optimization is:

********* Translated ABT *********
explain : Root [{p0}]
Filter []
|   EvalFilter []
|   |   Variable [p0]
|   PathConstant [] UnaryOp [Not] EvalFilter []
|   |   Variable [p0]
|   PathGet [array] PathComposeA []
|   |   PathCompare [Eq] Const [["a"]]
|   PathTraverse [1] PathCompare [Eq] Const [["a"]]
Scan [c_808eb2a6-144a-4226-a997-e279bb2cc7c3, {p0}]
 
********* Translated ABT *********

and after optimization:

********* Optimized ABT *********
explain :
Root [{p0}]
Filter []
|   EvalFilter []
|   |   Variable [p0]
|   PathGet [array] PathCompare [Neq] Const [["a"]]
Filter []
|   EvalFilter []
|   |   Variable [p0]
|   PathGet [array] PathLambda [] LambdaAbstraction [p2] UnaryOp [Not] EvalFilter []
|   |   Variable [p2]
|   PathTraverse [1] PathCompare [Eq] Const [["a"]]
PhysicalScan [{'<root>': p0}, c_808eb2a6-144a-4226-a997-e279bb2cc7c3]
 
********* Optimized ABT *********

It looks like the relevant rewrites are:

  • 'class NotPushdown', which includes:
    • pushing down Not through PathComposeA disjunction (DeMorgan's law)
    • combining Not Eq into Neq
  • 'struct SubstituteConvert<FilterNode>', which splits a PathComposeM conjunction into a sequence of two Filter stages.

This must be happening due to confusion around Nothing vs False. Missing fields are represented as Nothing, and most operations on Nothing return Nothing. I think even UnaryOp Not preserves Nothing. But at some point we coerce Nothing to False: this either happens in the Filter stage or in the EvalFilter expression.

I would probably first try disabling the above rewrites and see whether that makes the result correct. Then we can decide what changes are necessary to the rewrites, or to the initial ABT translation.



 Comments   
Comment by Hana Pearlman [ 31/Jul/23 ]

The initial ABT translation looks correct to me. It's the EvalFilter expression that coerces Nothing to False, so in the translated ABT the path below the EvalFilter will return Nothing, the EvalFilter will coerce this to False, and the UnaryOp [Not] will return True.

The same applies to the first Filter node in the ABT post-optimization. The second Filter node looks like the problem to me. I expect that the PathCompare [Neq] returns Nothing, which the EvalFilter coerces to False.

I am not very familiar with the NotPushdown rewrites, but it seems to me that the "combining Not Eq into Neq" step is causing the issue. UnaryOp [Not] above EvalFilter with PathCompare [eq] returns True when the path returns Nothing. EvalFilter with PathCompare [neq] returns False when the path returns Nothing.

It doesn't seem to me that this issue is specific to arrays, even though we are only seeing it with array comparisons currently. (edit: woops, that's what the last comment shows)

Comment by David Percy [ 21/Jul/23 ]

I found one more way to trigger this:

> db.c.drop()
true
> db.c.insert({_id: 5})
WriteResult({ "nInserted" : 1 })
> db.c.find({no_such_field: {$eq: 123}})
// empty, as expected
 
> db.c.find({no_such_field: {$not: {$eq: 123}}})
{ "_id" : 5 }
// correct
 
> db.c.createIndex({no_such_field:1})
{
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "createdCollectionAutomatically" : false,
        "ok" : 1
}
> db.c.find({no_such_field: {$not: {$eq: 123}}})
// empty -- incorrect

This example also involves combining Not Eq into Neq. That rewrite can only happen when there's no implicit array traversal, which is why the rewrite only happen when the index exists--the index metadata lets us eliminate the array traversal. This rewrite also happens with whole-array comparison: one of the two disjuncts doesn't have any implicit array traversal.

Generated at Thu Feb 08 06:40:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.