[SERVER-71508] Accessing nested array variable in $lookup+$let+$expr seems undefined Created: 20/Nov/22  Updated: 27/Oct/23  Resolved: 09/Aug/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.4.7, 6.0.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Igor Nem Assignee: Backlog - Query Execution
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File repro.js     File test.device.json     File test.machine.json    
Issue Links:
Gantt End to End
has to be finished together with SERVER-72157 Issue SERVER-71508 was closed with wr... Closed
Related
is related to SERVER-72157 Issue SERVER-71508 was closed with wr... Closed
Assigned Teams:
Query Execution
Operating System: ALL
Steps To Reproduce:

Create collection "machine"  with data

{"_id":"qqq","lines":[{"devices":{"type-c":{"disabled":false}}}]} 

Create collection device with data

{"machineId":"qqq","deviceEnabled":false}
{"machineId":"qqq","deviceEnabled":false} 

Execute aggregation

db.machine.aggregate([    {
        $lookup: {
            'from': 'device',
            let: {mId: '$_id', m_device_disabled: "$lines.devices.type-c.disabled"},
            as: 'res',
            pipeline: [
                {
                    $match:
                    {
                        $expr:
                        {
                            $and:
                                [
                                    { $eq: ['$$mId', '$machineId'] }
                                   ,{ $eq: [{$toBool: '$$m_device_disabled'},{$toBool: "$deviceEnabled"}] }
                                ]
                        }
                    }
                }]
        }
    }
    ,   {
         $project: {
           'res.deviceEnabled': 1
           'res._id': 1
           ,'_id': 1
         }
     }
    ])
    
    

Expected result:

{
    "_id" : "qqq",
    "res" : [
        {
            "_id" : ObjectId("6371fcc7ce3e296eed4c2b0c"),
            "deviceEnabled" : false
        },
        {
            "_id" : ObjectId("6371fcc7ce3e296eed4c2b32"),
            "deviceEnabled" : false
        }
    ]
} 

Actual result:

{
    "_id" : "qqq",
    "res" : [ ]
} 

Participants:

 Description   

I'm trying to make a join of two colections by multiple fields, one of them is boolean and in case when this field is "false"



 Comments   
Comment by Edwin Zhou [ 09/Aug/23 ]

Hi igornem1@gmail.com

After deeper investigation, this behavior is expected and works as designed. The "$lines.devices.type-c.disabled" field resolves to [false], and {$toBool: [false]} resolves to true since it is an array. We can demonstrate this by using $let to show that m_device_enabled is an array containing false

db.machine.aggregate([
    {
      $project: {
        m_device_enabled: {
          $let: {
            vars: { item: "$lines.devices.type-c.disabled" },
            in: "$$item",
          },
        },
      },
    },
  ])

[ { "_id" : "qqq", "m_device_enabled" : [ false ] } ]

This is defined behavior, and you will need to use $arrayElemAt, more concisely $first, or do an $unwind if there are multiple elements in lines.

Kind regards,
Edwin

Comment by Edwin Zhou [ 25/Jul/23 ]

Hi igornem1@gmail.com

Thank you for your patience while we investigate this issue—investigating problematic queries can be difficult. The recommendations that my colleagues have provided point to the undefined behavior you noted in your summary: $let on an array in $lookup produces an undefined behavior.

I think the behavior can be clearly demonstrated if we simplify your example a bit and add a new field with similar paths:

machine_coll.insert({
  _id: "qqq",
  lines: [{nested_field: false}],
  field: {nested_field: false}
});
 
machine_coll
  .aggregate([
    {
      $lookup: {
        from: "device",
        let: {
          mId: "$_id",
          m_device_disabled: "$lines.nested_field", // array field
        },
        as: "res",
        pipeline: [
          ...
        ],
      },
    }
  ]).toArray();
 
machine_coll
  .aggregate([
    {
      $lookup: {
        from: "device",
        let: {
          mId: "$_id",
          m_device_disabled: "$field.nested_field", // subdocument field
        },
        as: "res",
        pipeline: [
          ...
        ],
      },
    }
  ]).toArray();

Each aggregation respectively produces

// $let on subarray
[
	{
		"_id" : "qqq",
		"lines" : [
			{
				"nested_field" : false
			}
		],
		"field" : {
			"nested_field" : false
		},
		"res" : [ ]
	}
]
 
// $let on subdocument
[
	{
		"_id" : "qqq",
		"lines" : [
			{
				"nested_field" : false
			}
		],
		"field" : {
			"nested_field" : false
		},
		"res" : [
			{
				"_id" : ObjectId("64bfc0b7f492d2b761998b9d"),
				"machineId" : "qqq",
				"deviceEnabled" : false
			},
			{
				"_id" : ObjectId("64bfc0b7f492d2b761998b9e"),
				"machineId" : "qqq",
				"deviceEnabled" : false
			}
		]
	}
]

I do agree with my colleagues that using $addFields with $arrayElemAt to reduce the subarray into a subdocument, or $unwind to produce a set of documents before using $lookup is a way to work around this issue. I hope that you consider the solutions posed in the referenced forum post, or that you engage the MongoDB Developer Community Forums if you need additional help.

Given that this behavior seems undefined, I'd like to pass this ticket over to the appropriate team to further investigate whether this behavior is intended.

Kind regards,
Edwin

Comment by Igor Nem [ 16/Feb/23 ]

Hi, are any news or plans on it, please?

Comment by Igor Nem [ 16/Dec/22 ]

Hi Eric,

  1. I understood perfectly what your colleague Yuan mentioned, just since it not relevant to this issue I don't see any reason mention it and also don't understand why you mention it now.
  2. I have to say, that bechavour of my query is same no matter used $toBool or not, it work same without $toBool too, just it presenting in my example because I did different tests and at last wanted to be sure that DB recognize certainly that the value is of boolean type, but it didn't make any difference. And in accordance to other types(at least which I used) I didn't face some problem.
  3. As issue https://jira.mongodb.org/browse/SERVER-30812 is not relevant to this one, please don't mention it to me unlesst it will have some. Currently it creates an unnecessary mess.

Thank you

 

Comment by Eric Sedor [ 16/Dec/22 ]

Hi igornem1@gmail.com,

Apologies for any confusion and thanks for writing back.

What my colleague Yuan is pointing to is the lines array in the machine document. If you use $arrayElemAt (or $unwind if logically necessary) to extract the lines documents from the array prior to $lookup (as described at the referenced forum post) you can work around the problem you are seeing.

But as we look more we do think there could be an issue with the $expr clause referencing the "$$m_device_disabled" variable from the $let clause. It does not seem related to $toBool as far as we can tell so far. Have you seen a case where it works for other data types or in other cases?

We wonder if something similar to SERVER-30812 might be going on whereby the array is not properly accessible. We are going to investigate some more.

Eric

Comment by Igor Nem [ 15/Dec/22 ]

Hi @Yuan Fang, actually in my query there is no array field therefore the link you posted and your guess are not relevant.
Moreover, as mentioned in solution on link which you posted I used correct syntax: $match -> $expr -> $and ->$eq
Thank you

Comment by Yuan Fang [ 15/Dec/22 ]

Hi igornem1@gmail.com,

Thank you for reporting the issue. It looks like this thread on the forum discusses getting an element in an array in the aggregation pipeline. Can you see if it helps?

I quickly glanced at the aggregation pipeline you provided, it seems there is an ambiguity in the $let expression getting the "disabled" field in the machine collection. Thus, I think it is the expected behavior when trying to use $let to address an array field.

For any further issues like this, we'd like to encourage you to start by asking our community for help by posting on the MongoDB Developer Community Forums. If the discussion there leads you to suspect a bug in the MongoDB server, then we'd want to investigate it as a possible bug here in the SERVER project.

Regards,
Yuan

Generated at Thu Feb 08 06:19:12 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.