  1. Core Server
  2. SERVER-95857

Eliminate $getField special case for null/undefined inputs

    • Query Optimization

      In SERVER-57914, we changed the $getField expression so that it generally returns "missing" rather than throwing an exception or returning null when the input is not an object. However, when examining the code changes for SERVER-57914, I noticed that the behavior we actually implement is as follows:

      • If the input is an object, try to extract the field. Return the value of the field if it exists, or missing if the field does not exist.
      • If the input is null or undefined, return null.
      • Otherwise, return missing.
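
      The three cases above can be modeled in plain JavaScript. This is only an illustrative sketch, not the server implementation: the MISSING sentinel and the getFieldCurrent helper are hypothetical names introduced here, JavaScript undefined stands in for BSON undefined, and the sketch does not attempt to model an input that is itself missing.

```javascript
// Hypothetical sentinel standing in for MQL's "missing" value (not a server API).
const MISSING = Symbol("missing");

// Hypothetical model of $getField as currently implemented, following the
// three bullets in the ticket. Not the actual server code.
function getFieldCurrent(input, field) {
  // Bullet 2: null or undefined input returns null (the special case).
  if (input === null || input === undefined) {
    return null;
  }
  // Bullet 1: object input returns the field's value, or missing if absent.
  const isObject = typeof input === "object" && !Array.isArray(input);
  if (isObject) {
    return Object.prototype.hasOwnProperty.call(input, field)
      ? input[field]
      : MISSING;
  }
  // Bullet 3: any other input returns missing.
  return MISSING;
}
```

      In this model, getFieldCurrent("scalar", "b") yields MISSING while getFieldCurrent(null, "b") yields null, which is the asymmetry this ticket is about.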

      I find the behavior in bullet #2 above surprising. While it is generally true that attempting to evaluate an expression in MQL over a null input will return null, we chose in SERVER-57914 to make $getField special. Returning missing when the field cannot be extracted is what field path expressions should do, so our aim was to make $getField work similarly to a standard field path expression. By this logic, attempting to extract a field from a null scalar should also return missing.

      Here's a concrete example of this special case in practice:

      MongoDB Enterprise > db.c.find()
      { "_id" : ObjectId("670ee3aea74047218a8baf9b"), "a" : "scalar" }
      { "_id" : ObjectId("670ee519a74047218a8baf9d"), "a" : null }
      
      // When using a field path expression, the "result" field is missing for both input
      // documents.
      MongoDB Enterprise > db.c.aggregate([{$project: {result: "$a.b"}}])
      { "_id" : ObjectId("670ee3aea74047218a8baf9b") }
      { "_id" : ObjectId("670ee519a74047218a8baf9d") }
      
      // When expressing a similar thing by chaining $getField expressions, null doesn't
      // behave the same way as other scalar types.
      MongoDB Enterprise > db.c.aggregate([{$project: {result: {$getField: {field: "b", input: {$getField: "a"}}}}}])
      { "_id" : ObjectId("670ee3aea74047218a8baf9b") }
      { "_id" : ObjectId("670ee519a74047218a8baf9d"), "result" : null }
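
      To make the difference concrete, the chained $getField pipeline above can be simulated in plain JavaScript under both the current semantics and the semantics this ticket proposes. This is a rough sketch under stated assumptions: MISSING, extract, getFieldCurrent, getFieldProposed, and project are hypothetical names, the _id values are simplified to integers, and none of this is server code.

```javascript
// Hypothetical sentinel standing in for MQL's "missing" value.
const MISSING = Symbol("missing");

// Extract a field from an object; anything else (or an absent field) is missing.
function extract(input, field) {
  const isObject =
    input !== null && typeof input === "object" && !Array.isArray(input);
  return isObject && Object.prototype.hasOwnProperty.call(input, field)
    ? input[field]
    : MISSING;
}

// Current behavior as described in the ticket: null/undefined input returns null.
function getFieldCurrent(input, field) {
  if (input === null || input === undefined) return null;
  return extract(input, field);
}

// Proposed behavior: no special case, so null/undefined input returns missing.
function getFieldProposed(input, field) {
  return extract(input, field);
}

// Models {$project: {result: {$getField: {field: "b", input: {$getField: "a"}}}}}.
// A missing result means the "result" field is omitted from the output document.
function project(doc, getField) {
  const result = getField(getField(doc, "a"), "b");
  const out = { _id: doc._id };
  if (result !== MISSING) out.result = result;
  return out;
}

// Simplified stand-ins for the two documents in collection "c".
const docs = [
  { _id: 1, a: "scalar" },
  { _id: 2, a: null },
];

const current = docs.map((d) => project(d, getFieldCurrent));
const proposed = docs.map((d) => project(d, getFieldProposed));
console.log(JSON.stringify(current));
console.log(JSON.stringify(proposed));
```

      Under the current model only the document with a: null gains a "result" field (set to null); under the proposed model neither document does, matching the field path expression "$a.b".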
      

      We should consider eliminating this special case, since I am not aware of any rationale for why it should work this way. Perhaps it was an implementation error? That said, this would be an intentional behavioral change to the MQL language that is potentially backwards-breaking, so we would have to proceed with caution and avoid any backports to stable branches. If we decide to keep the current behavior out of fear of breaking applications, I suggest that we document it in something like an "MQL errata" (or known MQL problems) page.

            Assignee:
            colby.ing@mongodb.com Colby Ing
            Reporter:
            david.storch@mongodb.com David Storch