[SERVER-12685] Expand $unwind behavior to include empty documents optionally Created: 11/Feb/14  Updated: 05/Feb/16  Resolved: 21/Sep/15

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: 3.1.8

Type: Improvement Priority: Major - P3
Reporter: Ramesh Reddy Assignee: Charlie Swanson
Resolution: Done Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-19095 $lookup Closed
Related
related to SERVER-8088 $unwind of non-array should be allowed Closed
related to SERVER-20168 Add an option to $unwind to include d... Closed
Backwards Compatibility: Fully Compatible
Participants:

 Description   

Currently when a document has nested array based documents, $unwind only shows records where parent documents contain the $unwind field. It would be useful in certain scenarios to have an $unwind stage with some other optional configuration, which will produce an output document for a document that does not contain the $unwind field.

In a collection with the following documents:

{
	"FirstName" : "John",
	"LastName" : "Doe",
	"Notes" : [
		{
			"CustomerId" : DBRef("Customer", 1002),
			"PostDate" : ISODate("2014-02-05T00:04:45Z"),
			"Comment" : "Rocks"
		},
		{
			"CustomerId" : DBRef("Customer", 1002),
			"PostDate" : ISODate("2014-02-05T04:06:45Z"),
			"Comment" : "Sucks"
		}
	],
	"_id" : 1002
}
{
        "FirstName" : "Jane",
        "LastName" : "Smith",
        "_id" : 1001
}

When you run this pipeline:

db.coll.aggregate([{$unwind: '$Notes'}])

You get the following results:

{
	"FirstName" : "John",
	"LastName" : "Doe",
	"Notes" : 
		{
			"CustomerId" : DBRef("Customer", 1002),
			"PostDate" : ISODate("2014-02-05T00:04:45Z"),
			"Comment" : "Rocks"
		},
	"_id" : 1002
}
{
	"FirstName" : "John",
	"LastName" : "Doe",
	"Notes" :
		{
			"CustomerId" : DBRef("Customer", 1002),
			"PostDate" : ISODate("2014-02-05T04:06:45Z"),
			"Comment" : "Sucks"
		}
	],
	"_id" : 1002
}

It would be nice if it also returned the following result:

{   "_id" : 1001, 
    "FirstName" : "Jane", 
    "LastName" : "Smith" 
}

Here is user groups discussion on this topic
https://groups.google.com/forum/#!topic/mongodb-user/UKYPddRN37E

There is workaround exists using the $IFNULL, see the above link.



 Comments   
Comment by Asya Kamsky [ 24/Sep/15 ]

But if it is not an array, null, or not existing, then it still throws an error?

no, due to SERVER-8088 the pipeline allows $unwind to continue without any error even if field is not an array (starting in 3.1).

Comment by NOVALUE Mitar [ 24/Sep/15 ]

But if it is not an array, null, or not existing, then it still throws an error? OK.

Comment by Asya Kamsky [ 24/Sep/15 ]

Not a problem: here is how it works in 3.1.8 dev release:

test@local:27017(3.1.8) > db.ar.find()
{ "_id" : ObjectId("56032c2b58941a091122e8c2"), "a" : [ 1 ] }
{ "_id" : ObjectId("56032c2e58941a091122e8c3"), "a" : 1 }
{ "_id" : ObjectId("56032d5658941a091122e8c4"), "a" : null }
{ "_id" : ObjectId("56032d5b58941a091122e8c5"), "a" : [ ] }
{ "_id" : ObjectId("56032d5e58941a091122e8c6") }
test@local:27017(3.1.8) > db.ar.aggregate({$unwind:"$a"})
{ "_id" : ObjectId("56032c2b58941a091122e8c2"), "a" : 1 }
{ "_id" : ObjectId("56032c2e58941a091122e8c3"), "a" : 1 }

The above shows the old pre-3.2 behavior works exactly the same way.

If you use the new option with value false, it's exactly equivalent to not using the option at all, so we should see same result:

test@local:27017(3.1.8) > db.ar.aggregate({$unwind:{path:"$a",preserveNullAndEmptyArrays:false}})
{ "_id" : ObjectId("56032c2b58941a091122e8c2"), "a" : 1 }
{ "_id" : ObjectId("56032c2e58941a091122e8c3"), "a" : 1 }

If we change the value of the option to true we should get all the previously "dropped" documents from the unwind:

test@local:27017(3.1.8) > db.ar.aggregate({$unwind:{path:"$a",preserveNullAndEmptyArrays:true}})
{ "_id" : ObjectId("56032c2b58941a091122e8c2"), "a" : 1 }
{ "_id" : ObjectId("56032c2e58941a091122e8c3"), "a" : 1 }
{ "_id" : ObjectId("56032d5658941a091122e8c4"), "a" : null }
{ "_id" : ObjectId("56032d5b58941a091122e8c5"), "a" : [ ] }
{ "_id" : ObjectId("56032d5e58941a091122e8c6") }

You can see the three extra documents, in each "a" is the same as it was before unwind.

Comment by NOVALUE Mitar [ 24/Sep/15 ]

Sorry that it is hard to test runtime of the trunk code. I was basing my understanding on the documentation in the commit message.

Comment by Asya Kamsky [ 24/Sep/15 ]

Using this option, every document goes through the pipeline - none are lost during $unwind. Please try it before assuming it doesn't do what you want.

Comment by NOVALUE Mitar [ 24/Sep/15 ]

How does this address this ticket? This is not true. This addresses something, but not this ticket. This ticket was if the field does not exist, that document should be copied over. But you added feature for null and empty arrays? I think that for null I can maybe understand (it could be similar to when field does not exist), but if an array is empty, then no document should be constructed on the output, no?

So, please add a feature where for nonexisting (or null) field document is copied over. And how it behaves for empty array should be some other option, if at all. (Not sure who asked for that feature?)

So the discussion in the Google groups and also why I voted for this ticket was to support left outer join-like semantics. And unwinding an empty list is not how it works. Nonexisting value (or maybe null) is more reasonable.

(I would claim that only non-existing value documents should be copied over and null value fields should throw an error. And empty list fields should produce no output documents.)

Comment by Charlie Swanson [ 21/Sep/15 ]

This should be resolved as of 3cb6039e5d2968b1e361bf592452b95c7a76c770.

Generated at Thu Feb 08 03:29:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.