We recently introduced the $getField expression to provide an alternative way to extract fields from objects by their key name. See SERVER-30417.
Imagine you have a document where some field "foo" is missing. In this case, both the "$foo" field path expression and the corresponding $getField expression will return "missing":
MongoDB Enterprise > db.c.drop()
true
MongoDB Enterprise > db.c.insert({})
WriteResult({ "nInserted" : 1 })
MongoDB Enterprise > db.c.aggregate([{$project: {result: "$foo"}}])
{ "_id" : ObjectId("60d10a22b614644d4c0f1a31") }
MongoDB Enterprise > db.c.aggregate([{$project: {result: {$getField: "foo"}}}])
{ "_id" : ObjectId("60d10a22b614644d4c0f1a31") }
You can see that because both expressions return "missing", no field named "result" appears in the resulting document.
Let's now consider a similar example where the user is attempting to extract a field "bar" from object "foo". They can do so either with the dotted field path expression "$foo.bar", or with a chain of nested $getField expressions. However, the field path expression returns missing whereas the nested $getField expressions return null:
MongoDB Enterprise > db.c.aggregate([{$project: {result: "$foo.bar"}}])
{ "_id" : ObjectId("60d10a22b614644d4c0f1a31") }
MongoDB Enterprise > db.c.aggregate([{$project: {result: {$getField: {field: "bar", input: {$getField: "foo"}}}}}])
{ "_id" : ObjectId("60d10a22b614644d4c0f1a31"), "result" : null }
The reason for this behavior is that MQL expressions generally return null when any of their inputs is null, missing, or undefined. $getField follows this rule for its "input" argument: it returns null when "input" evaluates to null, missing, or undefined. (The "field" argument, on the other hand, must always be a string literal, which is validated at parse time.) Furthermore, MQL expressions other than field path expressions generally do not return missing. $getField is an exception: when "input" is an object that lacks the requested field, it returns missing, so that it behaves analogously to a field path expression.
The problem here is that this analogous behavior breaks down for dotted field paths. That is, a missing dotted field path will return null rather than missing if rewritten as a chain of nested $getField expressions. For this reason, we should consider changing the behavior of $getField so that it returns missing rather than null if the value of the "input" expression evaluates to missing, null, or undefined.
There is a similar problem if a scalar exists along a dotted path:
MongoDB Enterprise > db.c.find()
{ "_id" : ObjectId("60d10cbcb614644d4c0f1a32"), "foo" : 1 }
MongoDB Enterprise > db.c.aggregate([{$project: {result: "$foo.bar"}}])
{ "_id" : ObjectId("60d10cbcb614644d4c0f1a32") }
MongoDB Enterprise > db.c.aggregate([{$project: {result: {$getField: {field: "bar", input: {$getField: "foo"}}}}}])
uncaught exception: Error: command failed: {
    "ok" : 0,
    "errmsg" : "PlanExecutor error during aggregation :: caused by :: $getField requires 'input' to evaluate to type Object, but got double",
    "code" : 3041705,
    "codeName" : "Location3041705"
} with original command request: {
    "aggregate" : "c",
    "pipeline" : [ { "$project" : { "result" : { "$getField" : { "field" : "bar", "input" : { "$getField" : "foo" } } } } } ],
    "cursor" : { },
    "lsid" : { "id" : UUID("5de90c09-a31d-45f7-a1bf-53d307de419f") }
} on connection: connection to 127.0.0.1:27017 : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:731:17
assert.commandWorked@src/mongo/shell/assert.js:823:16
DB.prototype._runAggregate@src/mongo/shell/db.js:276:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1058:12
@(shell):1:1
The field path expression will return missing whereas the $getField version will throw an exception. This means that $getField would also have to return missing if "input" is any non-object type.
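To make the comparison concrete, here is a minimal JavaScript sketch that models the three behaviors discussed above. It is not the server implementation: the MISSING sentinel and the functions fieldPath, getFieldCurrent, and getFieldProposed are hypothetical names introduced only for illustration, and the model deliberately ignores how field paths traverse arrays.

```javascript
// Hypothetical stand-in for the server's internal "missing" value.
const MISSING = Symbol("missing");

const isNullish = (v) => v === MISSING || v === null || v === undefined;
const isObject = (v) => typeof v === "object" && v !== null && !Array.isArray(v);
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Models a field path such as "$foo.bar": an absent key or a non-object
// anywhere along the path yields MISSING. Array traversal is not modeled.
function fieldPath(doc, path) {
  let current = doc;
  for (const key of path.split(".")) {
    if (!isObject(current) || !hasOwn(current, key)) return MISSING;
    current = current[key];
  }
  return current;
}

// Models the behavior described in this ticket: null for a nullish input,
// an error for any other non-object input, MISSING for an absent field.
function getFieldCurrent(input, field) {
  if (isNullish(input)) return null;
  if (!isObject(input)) {
    throw new Error("$getField requires 'input' to evaluate to type Object");
  }
  return hasOwn(input, field) ? input[field] : MISSING;
}

// Models the suggested change: any non-object input (nullish or scalar)
// yields MISSING, which matches fieldPath for dotted paths.
function getFieldProposed(input, field) {
  if (!isObject(input)) return MISSING;
  return hasOwn(input, field) ? input[field] : MISSING;
}
```

Under this model, getFieldCurrent(getFieldCurrent({}, "foo"), "bar") yields null while fieldPath({}, "foo.bar") yields MISSING, which is the discrepancy shown in the shell output; nesting getFieldProposed the same way agrees with fieldPath for both the empty document and { foo: 1 }.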
It's not obvious whether this suggested change is a good idea. It depends on whether we want $getField to act like all other MQL expressions, or whether it should inherit the special behaviors of field path expressions.
Shout out to matthew.chiaravalloti for bringing this to our attention!
- design is described in: SERVER-94854 $getField accepts all input types and treats all non-object ones as null-ish (Closed)
- is related to: SERVER-30417 add expression to get value by keyname from object (Closed)
- related to: SERVER-58076 Exclude new language features from stable API for 1 quarter (Closed)
- related to: SERVER-95857 Eliminate $getField special case for null/undefined inputs (Needs Scheduling)