[SERVER-64102] $project field that references time-series meta field can be referenced by second $project field Created: 01/Mar/22  Updated: 29/Oct/23  Resolved: 10/Mar/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 6.0.0, 5.0.5, 5.2.1, 5.3.0-rc2
Fix Version/s: 5.3.2, 6.0.0-rc0, 5.0.7

Type: Bug Priority: Major - P3
Reporter: James Wahlin Assignee: James Wahlin
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-66570 Timeseries pushes down metaField-proj... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.3, v5.2, v5.0
Steps To Reproduce:

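// Start from a clean database.
db.dropDatabase();
// Time-series collection whose metaField is 'x'.
db.createCollection("tsColl", {timeseries: {timeField: 'time', metaField: 'x'}});

var doc = {time: new Date("2019-10-11T14:39:18.670Z"), x: 5};

// Insert the same document into the time-series collection and a regular collection.
db.tsColl.insert(doc);
db.regColl.insert(doc);

// "b" references "a", which is computed in the same $project and absent from the input.
var pipeline = [{$project: {_id: 0, a: "$x", b: "$a"}}];
var tsDoc = db.tsColl.aggregate(pipeline).toArray();
var regDoc = db.regColl.aggregate(pipeline).toArray();
// Both collections should return the same result.
assert.eq(tsDoc, regDoc);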

The above produces the following error:

Error: [[ { "a" : 5, "b" : 5 } ]] != [[ { "a" : 5 } ]] are not equal

Sprint: QO 2022-03-07, QO 2022-03-21
Participants:
Linked BF Score: 122

 Description   

See the included reproducer. In a $project stage such as {$project: {a: "$meta", b: "$a"}}, the "$a" expression should resolve only against the original document passed to the $project stage. On time-series collections, however, if "$a" names another field computed in the same $project, and that field references the meta field, then "b" is populated with the metadata value as well.
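The expected semantics can be illustrated with a toy model in plain JavaScript. This is only a sketch, not the server implementation: projectOnce is a hypothetical helper that handles nothing beyond exclusions and top-level "$field" references, which is enough to show why the regular collection returns only "a".

```javascript
// Toy model of $project for top-level "$field" references only.
// Hypothetical helper for illustration; not how the server evaluates $project.
function projectOnce(inputDoc, spec) {
    const out = {};
    for (const [name, expr] of Object.entries(spec)) {
        if (expr === 0 || expr === false) {
            continue;  // excluded field, e.g. _id: 0
        }
        if (typeof expr === "string" && expr.startsWith("$")) {
            // Field paths are resolved against the input document only,
            // never against fields computed earlier in the same stage.
            const value = inputDoc[expr.slice(1)];
            if (value !== undefined) {
                out[name] = value;  // a missing source field produces no output field
            }
        }
    }
    return out;
}

const input = {time: new Date("2019-10-11T14:39:18.670Z"), x: 5};
const expected = projectOnce(input, {_id: 0, a: "$x", b: "$a"});
console.log(JSON.stringify(expected));  // {"a":5}
```

Because the input document has no field "a", "b" is omitted, which matches the result from the regular collection in the reproducer.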



 Comments   
Comment by James Wahlin [ 06/Apr/22 ]

Author:

{'name': 'James Wahlin', 'email': 'james@mongodb.com', 'username': 'jameswahlin'}

Message: SERVER-63010 Ensure that unpacking measurements doesn't overwrite pushed down addFields that are computed on meta data

(cherry picked from commit 3a8cf3d2d1c6f668607756ab0b972f9ca2148f18)
Branch: v5.0
https://github.com/mongodb/mongo/commit/ba218f4a7fd63bf898cef934be794a4cdb8b0c51

Comment by Githook User [ 05/Apr/22 ]

Author:

{'name': 'James Wahlin', 'email': 'james@mongodb.com', 'username': 'jameswahlin'}

Message: SERVER-64102 Ensure that unpacking measurements doesn't overwrite pushedown addFields that are computed on meta data

(cherry picked from commit 3a8cf3d2d1c6f668607756ab0b972f9ca2148f18)
Branch: v5.3
https://github.com/mongodb/mongo/commit/04f9e187c1ea2c2f5240c324b40c01712d4370a2

Comment by James Wahlin [ 05/Apr/22 ]

commit 3a8cf3d2d1c6f668607756ab0b972f9ca2148f18

Author: James Wahlin <james@mongodb.com>

Date:   Wed Mar 2 15:15:05 2022 -0500

    SERVER-63010 Ensure that unpacking measurements doesn't overwrite pushed down addFields that are computed on meta data

Comment by James Wahlin [ 02/Mar/22 ]

This appears to be caused by an incorrect pipeline rewrite. The pipeline we generate for the reproducer is shown below. We split {$project: {_id: 0, a: "$x", b: "$a"}} into a pre-unpack $addFields that computes "a" and a post-unpack $project that references "$a". Because of the pre-unpack $addFields, "a" (holding the meta value) is already part of the document that the $project consumes, so "$a" resolves to it as if "a" were a normal field.

To fix this, we should prevent the pushdown of the $addFields when the field it produces is referenced elsewhere in the $project.
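A minimal sketch of the check proposed in this comment, in plain JavaScript. It is not the committed change and not a server API: metaFieldsSafeToPushDown is a hypothetical helper, and it assumes the $project spec contains only top-level "$field" strings and inclusion/exclusion flags.

```javascript
// Hypothetical helper: returns the computed fields that read the meta field
// and are not referenced by any other expression in the same $project.
function metaFieldsSafeToPushDown(projectSpec, metaField) {
    // Collect every top-level field name referenced by an expression in the stage.
    const referenced = new Set();
    for (const expr of Object.values(projectSpec)) {
        if (typeof expr === "string" && expr.startsWith("$")) {
            referenced.add(expr.slice(1).split(".")[0]);
        }
    }
    const safe = [];
    for (const [name, expr] of Object.entries(projectSpec)) {
        const readsMeta = typeof expr === "string" &&
            (expr === "$" + metaField || expr.startsWith("$" + metaField + "."));
        // Push down only if no expression in the stage reads the produced name.
        if (readsMeta && !referenced.has(name)) {
            safe.push(name);
        }
    }
    return safe;
}

// Reproducer spec: "a" is read by "b", so nothing is pushed down.
const blocked = metaFieldsSafeToPushDown({_id: 0, a: "$x", b: "$a"}, "x");
// Variant where "b" reads a different field: "a" may be pushed down.
const allowed = metaFieldsSafeToPushDown({_id: 0, a: "$x", b: "$time"}, "x");
console.log(blocked, allowed);
```

The check is deliberately conservative: any reference to the produced name blocks the pushdown, even when the input document would also contain a field of that name.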

	"stages" : [
		{
			"$cursor" : {
				"queryPlanner" : {
					...
				}
			}
		},
		{
			"$addFields" : {
				"a" : "$meta"
			}
		},
		{
			"$_internalUnpackBucket" : {
				"include" : [
					"a"
				],
				"timeField" : "time",
				"metaField" : "x",
				"bucketMaxSpanSeconds" : 3600,
				"computedMetaProjFields" : [
					"a"
				]
			}
		},
		{
			"$project" : {
				"a" : true,
				"b" : "$a",
				"_id" : false
			}
		}
	],
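
The stages above can be traced by hand to see where "b" picks up the meta value. The following is a toy trace in plain JavaScript, not server code; the bucket layout is reduced to the fields the trace needs, which is an assumption made for illustration.

```javascript
// Simplified bucket: real bucket documents carry more structure than this.
const bucket = {meta: 5, measurements: [{time: "2019-10-11T14:39:18.670Z"}]};

// Pre-unpack $addFields {a: "$meta"} runs on the bucket document.
const withComputed = {...bucket, a: bucket.meta};

// Unpack with include: ["a"] and computedMetaProjFields: ["a"]:
// each measurement carries the computed field "a" forward.
const unpacked = withComputed.measurements.map(() => ({a: withComputed.a}));

// Post-unpack $project {a: true, b: "$a", _id: false}:
// "$a" now resolves because "a" exists in this stage's input document.
const result = unpacked.map((doc) => ({a: doc.a, b: doc.a}));

console.log(JSON.stringify(result));  // [{"a":5,"b":5}]
```

The traced result matches the time-series side of the assertion failure in the reproducer.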

Generated at Thu Feb 08 05:59:30 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.