[SERVER-73888] Aggregation Pipeline: $unwind always gives an empty result with nested fields Created: 10/Feb/23  Updated: 10/Feb/23  Resolved: 10/Feb/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Blocker - P1
Reporter: Johnny Shields Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-59713 `$unwind` doesn't work on array subdo... Investigating
Operating System: ALL
Steps To Reproduce:

Given the following documents:
 
 

{ addresses: [{ city: 'Chicago' }, { city: 'London' }] }
{ addresses: [{ city: 'Seattle' }] }
{ addresses: [{ city: 'Chicago' }, { city: 'Seattle' }] }

 
 
And then I call $unwind in an aggregation pipeline:
 
 

{ "$unwind": "$addresses.city" }
{ "$group": { _id: "$addresses.city", counts: { "$sum": 1 } } }
 
=> {}

 
It will yield an empty result.
 
As a workaround, if I add a $project it will work:
 

{ "$project": { "p" => "$addresses.city" } }
{ "$unwind": "$p" }
{ "$group": { _id: "$p", counts: { "$sum": 1 } } }
 
=>
{ _id: "Chicago", counts: 2 }
{ _id: "London", counts: 1 }
{ _id: "Seattle", counts: 2 }

 
 

Participants:

 Description   

When using the AggregationPipeline, $unwind always gives an empty result with nested fields
 
For example:

{ "$unwind": "$addresses.city" }

Adding a $project stage will fix the issue:

{ "$project": { "p" => "$addresses.city" } }
{ "$unwind": "$p" }

Refer to Steps to Reproduce below for details.



 Comments   
Comment by Johnny Shields [ 10/Feb/23 ]

As noted above:

  • preserveNullAndEmptyArrays = false --> result is empty
  • preserveNullAndEmptyArrays = true --> result is incorrect, same as if you hadn't done `$unwind` at all.

In your result with preserveNullAndEmptyArrays = true, you get "_id" as an array like this:

 "_id": [
      "Chicago",
      "London"
    ],

it should be:

 "_id": "Chicago"
 "_id": "London"

in two separate rows, i.e. "$unwind"ed.

Anyway, its fine to close this ticket as a dupe of SERVER-59713.

Comment by Alex Bevilacqua [ 10/Feb/23 ]

shields@tablecheck.com I apparently jumped the gun on the repro as I was testing to ensure the pipeline didn't return an empty result:

db.foo.drop();
db.foo.insertMany([
  { addresses: [{ city: 'Chicago' }, { city: 'London' }] },
  { addresses: [{ city: 'Seattle' }] },
  { addresses: [{ city: 'Chicago' }, { city: 'Seattle' }] }
]);
db.foo.aggregate([
  { "$project": { "p": "$addresses.city" } },
  { "$unwind": "$p" },
  { "$group": { _id: "$p", counts: { "$sum": 1 } } }
])
/*
[
  {
    "_id": "Chicago",
    "counts": 2
  },
  {
    "_id": "London",
    "counts": 1
  },
  {
    "_id": "Seattle",
    "counts": 2
  }
]
*/
db.foo.aggregate([
  { "$unwind": { path: "$addresses.city", preserveNullAndEmptyArrays: true } },
  { "$group": { _id: "$addresses.city", counts: { "$sum": 1 } } }
])
/*
[
  {
    "_id": [
      "Chicago",
      "London"
    ],
    "counts": 1
  },
  {
    "_id": [
      "Seattle"
    ],
    "counts": 1
  },
  {
    "_id": [
      "Chicago",
      "Seattle"
    ],
    "counts": 1
  }
]
*/
 

Comment by Johnny Shields [ 10/Feb/23 ]

You are correct that this is a duplicate of SERVER-59713

Comment by Johnny Shields [ 10/Feb/23 ]

Alex, I have tried this and it does not work.

  • According to the docs:

    { "$unwind": { path: "$addresses.city" } } and

  • { "$unwind": "$addresses.city" }

    are equivalent (the latter is an alias for the former).

  • The `preserveNullAndEmptyArrays` arg does not have the desired effect. If it is false, the $unwind yields an empty result, and if it is true, it effectively bypasses the $unwind and yields the non-unwinded arrays. (The behavior of preserveNullAndEmptyArrays in the nested fields case should be looked at within the scope of this ticket, it buggy/unintended as well.)

Please post the result of your aggregation.

Comment by Alex Bevilacqua [ 10/Feb/23 ]

shields@tablecheck.com I think this actually works as designed. You can pass the path when you're unwinding an array field path to produce the desired result you've described without needing the $project to a temporary field first.

For example:

db.foo.drop();
db.foo.insertMany([
  { addresses: [{ city: 'Chicago' }, { city: 'London' }] },
  { addresses: [{ city: 'Seattle' }] },
  { addresses: [{ city: 'Chicago' }, { city: 'Seattle' }] }
]);
db.foo.aggregate([
  { "$unwind": { path: "$addresses.city", preserveNullAndEmptyArrays: true } },
  { "$group": { _id: "$addresses.city", counts: { "$sum": 1 } } }
])

I'm going to close this ticket out as it appears to duplicate SERVER-59713 which describes a similar issue.

Comment by Johnny Shields [ 10/Feb/23 ]

Note that other operators such as $group handle nested.fields fine, so I consider this to be a possible bug.

Generated at Thu Feb 08 06:25:55 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.