[SERVER-30812] When using an array element as the local field for $lookup, $project doesn't work Created: 24/Aug/17  Updated: 08/Sep/23  Resolved: 08/Sep/23

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 3.4.9, 3.6.0-rc0, 6.3.2, 6.0.10, 5.0.21
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Damian Mastylo Assignee: Alexander Ignatyev
Resolution: Done Votes: 2
Labels: asya, mql-semantics, query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File SERVER-30812.js    
Issue Links:
Related
related to SERVER-26066 $project should add top-level field t... Closed
is related to SERVER-25200 $project computed fields on arrays se... Closed
Assigned Teams:
Query Optimization
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

db.venues.aggregate(
 
	// Pipeline
	[
		// Stage 1
		{
			$lookup: {
			    "from" : "chains",
			    "localField" : "chainIds.0",
			    "foreignField" : "_id",
			    "as" : "venueChain"
			}
		},
 
		// Stage 2
		{
			$addFields: {
			  "chainName": "$venueChain.name"
			}
		},
 
		// Stage 3
		{
			$project: {
			    n: 1,
			    chainName: 1
			}
		},
	],
 
	// Options
	{
		cursor: {
			batchSize: 50
		}
	}
 
);

Sprint: QO 2023-08-21, QO 2023-09-04, QO 2023-09-18
Participants:

 Description   

If an array element is used as the localField for the $lookup aggregation pipe, a $project anywhere else in the pipeline will fail to return results and also take a while to run. There is a workaround by adding a field containing the local element and using that field as the localField:

db.venues.aggregate(
 
	// Pipeline
	[
		// Stage 1
		{
			$addFields: {
			  "firstChainId": {
			    "$arrayElemAt": [ "$chainIds", 0 ]
			  }
			}
		},
 
		// Stage 2
		{
			$lookup: {
			    "from" : "chains",
			    "localField" : "firstChainId",
			    "foreignField" : "_id",
			    "as" : "venueChain"
			}
		},
 
		// Stage 3
		{
			$addFields: {
			  "chainName": "$venueChain.name"
			}
		},
 
		// Stage 4
		{
			$project: {
			  n: 1,
			  chainName: 1
			}
		},
	],
 
	// Options
	{
		cursor: {
			batchSize: 50
		}
	}
);



 Comments   
Comment by Alexander Ignatyev [ 08/Sep/23 ]

The investigation revealed that the ticket had been fixed by SERVER-75101.

Comment by Alexander Ignatyev [ 07/Sep/23 ]

kateryna.kamenieva@mongodb.com, the issue was fixed in SERVER-75101, commit 9f160dea3f2d311efb8f981c239e717b79a09f65

Comment by David Storch [ 27/Nov/17 ]

asya, yeah, it looks pretty closely related. I'd rather keep both tickets open for the time being until we have determined a clear path to fix. As I mentioned above, I can imagine fixing this either by treating this as a bug in projection semantics or as a bug in dependency analysis.

Comment by Asya Kamsky [ 23/Nov/17 ]

Is this just an instance of SERVER-26066?

Comment by Asya Kamsky [ 23/Nov/17 ]

It looks like there may be other workarounds to the issue.

One is if there is an explicit $project before $lookup for the array field

db.coll.aggregate([{$project:{x:1}},{$lookup: {from: "coll", as: "as", localField: "x.0", foreignField: "x.0"}}, {$addFields: {z: "$as.x.y"}}, {$project: {z: 1}}])
{ "_id" : ObjectId("5a17305db5843981b6727de7"), "z" : [ [ 1, 2 ] ] }

Comment by David Storch [ 24/Oct/17 ]

This looks like a bug having to do with aggregation's dependency analysis, which attempts to compute the projection that appears in the explain output as $cursor.fields. The first query that James posted above returns the incorrect results because "x.0" is computed as a dependency. However, this projection causes the query subsystem to incorrectly exclude some of the data required to execute the $lookup:

> db.coll.find({}, {"as.x.y": 1, "x.0": 1, "z": 1, "_id": 1})
{ "_id" : ObjectId("59ef461bce8432e1c921c516"), "x" : [ {  }, {  } ] }

I can imagine a few ways to fix this:

  • Change the behavior of the "x.0" projection above, since its behavior is arguably a bug.
  • Change the dependency analysis system to avoid pushing projections involving numerical path components down to the $cursor stage.
Comment by James Wahlin [ 24/Oct/17 ]

Reopening as I can reproduce this issue locally:

> db.coll.insert({x: [{y: 1}, {y: 2}]})
WriteResult({ "nInserted" : 1 })
> db.coll.aggregate([{$lookup: {from: "coll", as: "as", localField: "x.0", foreignField: "x.0"}}, {$addFields: {z: "$as.x.y"}}, {$project: {z: 1}}])
{ "_id" : ObjectId("59ef3e7d99f6610e714ff562"), "z" : [ ] }
> db.coll.aggregate([{$addFields: {a: {$arrayElemAt: ["$x", 0]}}},{$lookup: {from: "coll", as: "as", localField: "a", foreignField: "x.0"}}, {$addFields: {z: "$as.x.y"}}, {$project: {z: 1}}])
{ "_id" : ObjectId("59ef3e7d99f6610e714ff562"), "z" : [ [ 1, 2 ] ] }

Below is the explain for both runs:

> db.coll.explain().aggregate([{$lookup: {from: "coll", as: "as", localField: "x.0", foreignField: "x.0"}}, {$addFields: {z: "$as.x.y"}}, {$project: {z: 1}}])
{
	"stages" : [
		{
			"$cursor" : {
				"query" : {
					
				},
				"fields" : {
					"as.x.y" : 1,
					"x.0" : 1,
					"z" : 1,
					"_id" : 1
				},
				"queryPlanner" : {
					"plannerVersion" : 1,
					"namespace" : "test.coll",
					"indexFilterSet" : false,
					"parsedQuery" : {
						
					},
					"winningPlan" : {
						"stage" : "COLLSCAN",
						"direction" : "forward"
					},
					"rejectedPlans" : [ ]
				}
			}
		},
		{
			"$lookup" : {
				"from" : "coll",
				"as" : "as",
				"localField" : "x.0",
				"foreignField" : "x.0"
			}
		},
		{
			"$addFields" : {
				"z" : "$as.x.y"
			}
		},
		{
			"$project" : {
				"_id" : true,
				"z" : true
			}
		}
	],
	"ok" : 1
}
>
> db.coll.explain().aggregate([{$addFields: {a: {$arrayElemAt: ["$x", 0]}}},{$lookup: {from: "coll", as: "as", localField: "a", foreignField: "x.0"}}, {$addFields: {z: "$as.x.y"}}, {$project: {z: 1}}])
{
	"stages" : [
		{
			"$cursor" : {
				"query" : {
					
				},
				"fields" : {
					"a" : 1,
					"as.x.y" : 1,
					"x" : 1,
					"z" : 1,
					"_id" : 1
				},
				"queryPlanner" : {
					"plannerVersion" : 1,
					"namespace" : "test.coll",
					"indexFilterSet" : false,
					"parsedQuery" : {
						
					},
					"winningPlan" : {
						"stage" : "COLLSCAN",
						"direction" : "forward"
					},
					"rejectedPlans" : [ ]
				}
			}
		},
		{
			"$addFields" : {
				"a" : {
					"$arrayElemAt" : [
						"$x",
						{
							"$const" : 0
						}
					]
				}
			}
		},
		{
			"$lookup" : {
				"from" : "coll",
				"as" : "as",
				"localField" : "a",
				"foreignField" : "x.0"
			}
		},
		{
			"$addFields" : {
				"z" : "$as.x.y"
			}
		},
		{
			"$project" : {
				"_id" : true,
				"z" : true
			}
		}
	],
	"ok" : 1
}

Comment by Mark Agarunov [ 20/Oct/17 ]

Hello damian@liveapp.com,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Thanks,
Mark

Comment by Kelsey Schubert [ 29/Sep/17 ]

Hi damian@liveapp.com,

We still need additional information to diagnose the problem. If this is still an issue for you, would you please provide the output of the commands executed with the explain option?

Thank you,
Kelsey

Comment by Ramon Fernandez Marina [ 14/Sep/17 ]

Apologies for the delay in getting back to you damian@liveapp.com. Can you please run both commands with the explain option and send us the output?

Thanks,
Ramón.

Generated at Thu Feb 08 04:25:05 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.