  1. Core Server
  2. SERVER-79205

[CQF] $not $eq array, on a missing field, should return true


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Query Optimization
    • ALL
    • QO 2023-08-21

    Description

      This is something the fuzzer discovered.

      Example collection and queries on classic:

      > db1.c.find()
      { "_id" : 5 }
       
      > db1.c.find({array: {$eq: ['a']}})
      // empty results
       
      > db1.c.find({array: {$not: {$eq: ['a']}}})
      { "_id" : 5 }
      

      Same example with forceBonsai:

      > db2.c.find()
      { "_id" : 5 }
       
      > db2.c.find({array: {$eq: ['a']}})
      // empty results
       
      > db2.c.find({array: {$not: {$eq: ['a']}}})
      // empty results -- incorrect
      

      This only happens when the constant is an array. For example, with a scalar constant both engines return the correct result:

      > db1.c.find({array: {$not: {$eq: 'a'}}})
      { "_id" : 5 }
      > db2.c.find({array: {$not: {$eq: 'a'}}})
      { "_id" : 5 }
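The expected (classic) results above can be reproduced with a small stand-alone model of the match semantics. This is only a sketch under stated assumptions: `deepEqual`, `matchesEq` and `matchesNotEq` are hypothetical helpers, not server code, and the model handles only scalars and arrays with a non-null constant.

```javascript
// Toy model of classic match semantics for {field: {$eq: c}} and its $not.
// Not the server's implementation; all helper names are hypothetical.

function deepEqual(a, b) {
  if (Array.isArray(a) && Array.isArray(b)) {
    return a.length === b.length && a.every((x, i) => deepEqual(x, b[i]));
  }
  return a === b; // scalars only; nested objects are out of scope here
}

// $eq matches when the field value equals the constant, or when the field
// holds an array with an element equal to the constant.
function matchesEq(doc, field, constant) {
  // Assumes a non-null constant: a missing field then does not match $eq.
  if (!(field in doc)) return false;
  const value = doc[field];
  if (deepEqual(value, constant)) return true;
  return Array.isArray(value) && value.some((el) => deepEqual(el, constant));
}

// $not is modeled as a plain boolean negation of the inner predicate.
function matchesNotEq(doc, field, constant) {
  return !matchesEq(doc, field, constant);
}

const sampleDoc = { _id: 5 };
console.log(matchesEq(sampleDoc, "array", ["a"]));    // false
console.log(matchesNotEq(sampleDoc, "array", ["a"])); // true
console.log(matchesNotEq(sampleDoc, "array", "a"));   // true
```

Under this model the `$not` query keeps `{ "_id" : 5 }` for both the array and the scalar constant, which is the behavior the classic engine shows.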
      

      The Bonsai plan before optimization is:

      ********* Translated ABT *********
      explain : Root [{p0}]
      Filter []
      |   EvalFilter []
      |   |   Variable [p0]
      |   PathConstant [] UnaryOp [Not] EvalFilter []
      |   |   Variable [p0]
      |   PathGet [array] PathComposeA []
      |   |   PathCompare [Eq] Const [["a"]]
      |   PathTraverse [1] PathCompare [Eq] Const [["a"]]
      Scan [c_808eb2a6-144a-4226-a997-e279bb2cc7c3, {p0}]
       
      ********* Translated ABT *********
      

      and after optimization:

      ********* Optimized ABT *********
      explain :
      Root [{p0}]
      Filter []
      |   EvalFilter []
      |   |   Variable [p0]
      |   PathGet [array] PathCompare [Neq] Const [["a"]]
      Filter []
      |   EvalFilter []
      |   |   Variable [p0]
      |   PathGet [array] PathLambda [] LambdaAbstraction [p2] UnaryOp [Not] EvalFilter []
      |   |   Variable [p2]
      |   PathTraverse [1] PathCompare [Eq] Const [["a"]]
      PhysicalScan [{'<root>': p0}, c_808eb2a6-144a-4226-a997-e279bb2cc7c3]
       
      ********* Optimized ABT *********
      

      It looks like the relevant rewrites are:

      • 'class NotPushdown', which includes:
        • pushing down Not through PathComposeA disjunction (DeMorgan's law)
        • combining Not Eq into Neq
      • 'struct SubstituteConvert<FilterNode>', which splits a PathComposeM conjunction into a sequence of two Filter stages.
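The combined effect of these rewrites can be sketched on a toy predicate tree. The node shapes and the functions `pushNot` and `splitConjunction` below are hypothetical and far simpler than the real ABT paths; the sketch only shows how Not over a disjunction could turn into a Neq filter plus a separate negated-traverse filter, as in the optimized plan.

```javascript
// Toy predicate tree; node shapes are hypothetical, not actual ABT nodes.
const eq = (c) => ({ op: "eq", c });
const neq = (c) => ({ op: "neq", c });
const not = (child) => ({ op: "not", child });
const or = (left, right) => ({ op: "or", left, right });
const and = (left, right) => ({ op: "and", left, right });
const traverse = (child) => ({ op: "traverse", child });

// Push a Not toward the leaves: De Morgan over "or", and Not Eq becomes Neq.
function pushNot(node) {
  if (node.op !== "not") return node;
  const inner = node.child;
  if (inner.op === "or") {
    return and(pushNot(not(inner.left)), pushNot(not(inner.right)));
  }
  if (inner.op === "eq") return neq(inner.c);
  return node; // e.g. Not over a traverse is left as it is
}

// Split a conjunction into a list of separately applied filters.
function splitConjunction(node) {
  if (node.op !== "and") return [node];
  return [...splitConjunction(node.left), ...splitConjunction(node.right)];
}

// Not(Eq ["a"] OR Traverse(Eq ["a"])), mirroring the translated plan.
const original = not(or(eq(["a"]), traverse(eq(["a"]))));
const filters = splitConjunction(pushNot(original));
console.log(JSON.stringify(filters, null, 2));
```

The result is two filters: one `neq` against `["a"]` and one `not` wrapped around the traverse. The rewrite is valid in two-valued logic; whether it stays valid once Nothing is involved is a separate question.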

      This is most likely caused by confusion between Nothing and False. Missing fields are represented as Nothing, and most operations on Nothing return Nothing. I think even UnaryOp Not preserves Nothing. But at some point we coerce Nothing to False: this happens either in the Filter stage or in the EvalFilter expression.

      I would probably first try disabling the above rewrites and see whether that makes the result correct. Then we can decide what changes are necessary to the rewrites, or to the initial ABT translation.
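The Nothing-versus-False hypothesis can be illustrated with a toy three-valued evaluator. Everything here is an assumption made for illustration: `NOTHING`, the function names, and above all the choice to coerce Nothing to false at every EvalFilter boundary. It is not how the server is known to behave, and it does not attempt to capture why scalar constants are unaffected.

```javascript
// Toy three-valued evaluation; NOTHING and all function names are hypothetical.
const NOTHING = Symbol("Nothing");
const same = (a, b) => JSON.stringify(a) === JSON.stringify(b);

// Assumed: comparisons, traverse and Not all propagate Nothing.
const cmpEq = (v, c) => (v === NOTHING ? NOTHING : same(v, c));
const cmpNeq = (v, c) => (v === NOTHING ? NOTHING : !same(v, c));
const traverseEq = (v, c) => {
  if (v === NOTHING) return NOTHING;
  return Array.isArray(v) ? v.some((el) => same(el, c)) : same(v, c);
};
const logicalNot = (b) => (b === NOTHING ? NOTHING : !b);
const logicalOr = (a, b) => {
  if (a === true || b === true) return true;
  return a === NOTHING || b === NOTHING ? NOTHING : false;
};

// Assumed coercion point: an EvalFilter boundary treats Nothing as false.
const coerce = (b) => b === true;

// Translated plan: Not is applied outside the inner EvalFilter.
function keepOriginal(v, c) {
  const inner = coerce(logicalOr(cmpEq(v, c), traverseEq(v, c)));
  return coerce(logicalNot(inner));
}

// Optimized plan: a Neq filter plus a separate Not(traverse) filter.
function keepRewritten(v, c) {
  const first = coerce(cmpNeq(v, c));
  const second = coerce(logicalNot(coerce(traverseEq(v, c))));
  return first && second;
}

console.log(keepOriginal(NOTHING, ["a"]));  // true
console.log(keepRewritten(NOTHING, ["a"])); // false
console.log(keepOriginal(["b"], ["a"]), keepRewritten(["b"], ["a"])); // true true
```

In this model the two plans agree whenever the field is present and diverge only for a missing field, because the Neq comparison yields Nothing and the filter then drops the document.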

          People

            backlog-query-optimization Backlog - Query Optimization
            david.percy@mongodb.com David Percy