[SERVER-57914] Make $getField return missing if "input" is missing or not an object
Created: 21/Jun/21  Updated: 29/Oct/23  Resolved: 28/Jun/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.1.0-rc0

Type: Improvement
Priority: Major - P3
Reporter: David Storch
Assignee: Ruslan Abdulkhalikov (Inactive)
Resolution: Fixed
Votes: 0
Labels: post-rc0
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-58076 Exclude new language features from st... Closed
is related to SERVER-30417 add expression to get value by keynam... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v5.0
Sprint: Query Optimization 2021-07-12
Participants:

 Description   

We recently introduced the $getField expression to provide an alternative way to extract fields from objects by their key name. See SERVER-30417.

Imagine you have a document where some field "foo" is missing. In this case, both the "$foo" field path expression and the corresponding $getField expression will return "missing":

MongoDB Enterprise > db.c.drop()
true
MongoDB Enterprise > db.c.insert({})
WriteResult({ "nInserted" : 1 })
MongoDB Enterprise > db.c.aggregate([{$project: {result: "$foo"}}])
{ "_id" : ObjectId("60d10a22b614644d4c0f1a31") }
MongoDB Enterprise > db.c.aggregate([{$project: {result: {$getField: "foo"}}}])
{ "_id" : ObjectId("60d10a22b614644d4c0f1a31") }

You can see that because both expressions return "missing", no field named "result" appears in the resulting document.

Let's now consider a similar example where the user is attempting to extract a field "bar" from object "foo". They can do so either with the dotted field path expression "$foo.bar", or with a chain of nested $getField expressions. However, the field path expression returns missing whereas the nested $getField expressions return null:

MongoDB Enterprise > db.c.aggregate([{$project: {result: "$foo.bar"}}])
{ "_id" : ObjectId("60d10a22b614644d4c0f1a31") }
MongoDB Enterprise > db.c.aggregate([{$project: {result: {$getField: {field: "bar", input: {$getField: "foo"}}}}}])
{ "_id" : ObjectId("60d10a22b614644d4c0f1a31"), "result" : null }

The reason for this behavior is that MQL expressions generally return null when any of their inputs is null, missing, or undefined. $getField follows this rule for its "input" argument: it returns null when "input" is null, missing, or undefined. (The "field" argument, on the other hand, must always be a string literal, which is validated at parse time.) Furthermore, MQL expressions other than field path expressions generally do not return missing. $getField is an exception: it returns missing when the named field does not exist in the input object, so that it is analogous to a field path expression.

The problem is that this analogy breaks down for dotted field paths: a dotted field path that evaluates to missing returns null, rather than missing, when rewritten as a chain of nested $getField expressions. For this reason, we should consider changing $getField so that it returns missing rather than null when its "input" expression evaluates to missing, null, or undefined.

There is a similar problem if a scalar exists along a dotted path:

MongoDB Enterprise > db.c.find()
{ "_id" : ObjectId("60d10cbcb614644d4c0f1a32"), "foo" : 1 }
MongoDB Enterprise > db.c.aggregate([{$project: {result: "$foo.bar"}}])
{ "_id" : ObjectId("60d10cbcb614644d4c0f1a32") }
MongoDB Enterprise > db.c.aggregate([{$project: {result: {$getField: {field: "bar", input: {$getField: "foo"}}}}}])
uncaught exception: Error: command failed: {
	"ok" : 0,
	"errmsg" : "PlanExecutor error during aggregation :: caused by :: $getField requires 'input' to evaluate to type Object, but got double",
	"code" : 3041705,
	"codeName" : "Location3041705"
} with original command request: {
	"aggregate" : "c",
	"pipeline" : [
		{
			"$project" : {
				"result" : {
					"$getField" : {
						"field" : "bar",
						"input" : {
							"$getField" : "foo"
						}
					}
				}
			}
		}
	],
	"cursor" : {
 
	},
	"lsid" : {
		"id" : UUID("5de90c09-a31d-45f7-a1bf-53d307de419f")
	}
} on connection: connection to 127.0.0.1:27017 : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:731:17
assert.commandWorked@src/mongo/shell/assert.js:823:16
DB.prototype._runAggregate@src/mongo/shell/db.js:276:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1058:12
@(shell):1:1

The field path expression will return missing whereas the $getField version will throw an exception. This means that $getField would also have to return missing if "input" is any non-object type.
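
The difference between the three behaviors can be sketched as a small plain-JavaScript model. This is only an illustration of the semantics described in this ticket, not the server implementation: the names MISSING, fieldPath, getFieldBefore, getFieldProposed, and chain are hypothetical, array traversal is deliberately not modeled, and documents are assumed to hold only plain JSON-like values.

```javascript
// Sentinel standing in for the server's "missing" value (hypothetical name).
const MISSING = Symbol("missing");

// True only for plain embedded documents; arrays and scalars are excluded.
const isObject = (v) => v !== null && typeof v === "object" && !Array.isArray(v);
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Simplified field path lookup: an absent key or a non-object step yields MISSING.
// Array traversal is not modeled in this sketch.
function fieldPath(doc, path) {
  let cur = doc;
  for (const key of path.split(".")) {
    if (!isObject(cur) || !hasOwn(cur, key)) return MISSING;
    cur = cur[key];
  }
  return cur;
}

// $getField as the shell examples above show it behaving before the change.
function getFieldBefore(input, field) {
  if (input === MISSING || input === null || input === undefined) return null;
  if (!isObject(input)) {
    throw new Error("$getField requires 'input' to evaluate to type Object");
  }
  return hasOwn(input, field) ? input[field] : MISSING;
}

// $getField as proposed: any non-object input yields MISSING.
function getFieldProposed(input, field) {
  if (!isObject(input)) return MISSING;
  return hasOwn(input, field) ? input[field] : MISSING;
}

// "$foo.bar" rewritten as nested $getField calls, rooted at the document.
const chain = (getField, doc) => getField(getField(doc, "foo"), "bar");
```

In this model, chain(getFieldBefore, {}) yields null and chain(getFieldBefore, {foo: 1}) throws, whereas chain(getFieldProposed, ...) yields MISSING in both cases, matching fieldPath(doc, "foo.bar").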

It's not obvious whether this suggested change is a good idea or not. It depends on whether we want $getField to act like all other MQL expressions, or if it should inherit the special behaviors of field path expressions.
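
Whichever default is chosen, a pipeline can make the field-path-like behavior explicit by checking the type of the input before calling $getField. The following is a rough sketch and not part of this ticket's change: guardedGetField is a hypothetical helper name, and the approach assumes that $cond evaluates only the branch it selects, so that the inner $getField is never run on a non-object input.

```javascript
// Hypothetical helper: builds an aggregation expression that evaluates
// $getField only when `inputExpr` yields an embedded document, and
// otherwise evaluates to $$REMOVE so that $project omits the field.
function guardedGetField(field, inputExpr) {
  return {
    $cond: [
      // $type reports "object" for embedded documents and "missing" for absent values.
      { $eq: [{ $type: inputExpr }, "object"] },
      { $getField: { field: field, input: inputExpr } },
      "$$REMOVE",
    ],
  };
}

// Guarded equivalent of "$foo.bar" for use in a $project stage.
const pipeline = [
  { $project: { result: guardedGetField("bar", { $getField: "foo" }) } },
];
```

The resulting pipeline could be passed to db.c.aggregate(pipeline) in the shell. The guard makes the intent visible at the cost of a more verbose expression.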

Shout out to matthew.chiaravalloti for bringing this to our attention!



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 28/Jun/21 ]

Author:

{'name': 'Ruslan Abdulkhalikov', 'email': 'ruslan.abdulkhalikov@mongodb.com', 'username': 'rusabd1'}

Message: SERVER-57914 make getField return missing values

(cherry picked from commit d3d1f3bfe78f39df042bed6bb31b80dbe0b479f8)
Branch: v5.0
https://github.com/mongodb/mongo/commit/751eb739c00c463b54823e7ab6da74c40172365f

Comment by Githook User [ 28/Jun/21 ]

Author:

{'name': 'Ruslan Abdulkhalikov', 'email': 'ruslan.abdulkhalikov@mongodb.com', 'username': 'rusabd1'}

Message: SERVER-57914 make getField return missing values
Branch: master
https://github.com/mongodb/mongo/commit/d3d1f3bfe78f39df042bed6bb31b80dbe0b479f8

Comment by Ruslan Abdulkhalikov (Inactive) [ 25/Jun/21 ]

https://mongodbcr.appspot.com/775370005/
