[SERVER-8088] $unwind of non-array should be allowed Created: 06/Jan/13  Updated: 15/May/18  Resolved: 11/Mar/15

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: 3.1.0

Type: Bug
Priority: Major - P3
Reporter: Scott Hernandez (Inactive)
Assignee: Eliot Horowitz (Inactive)
Resolution: Done
Votes: 11
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Documented
is documented by DOCS-9024 $unwind of non-array should be allowed Closed
Duplicate
is duplicated by SERVER-11718 Allow aggregation framework to option... Closed
Related
is related to SERVER-12685 Expand $unwind behavior to include em... Closed
Backwards Compatibility: Minor Change
Operating System: ALL

 Description   

Applying $unwind to a non-array field would effectively just output the document with the existing field value.

> db.reg.find()
{ "_id" : 1, "text" : "foo" }
{ "_id" : 2, "text" : "bar" }
{ "_id" : 3, "text" : "Bar" }
{ "_id" : 4, "text" : [ "bar", "foo" ] }
> db.reg.aggregate({$match:{text:/ba/i}}, {$unwind:"$text"})
...
	"errmsg" : "exception: $unwind:  value at end of field path must be an array",
	"code" : 15978,
	"ok" : 0

This will be important for collections where the same field holds an array in some documents and a non-array value in others.



 Comments   
Comment by Githook User [ 11/Mar/15 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: SERVER-8088: $unwind of scalar should return 1 doc with scalar
Branch: master
https://github.com/mongodb/mongo/commit/f4d17dd81431f9724006c0837ccac44068971b1d

Comment by Laurent Dollé [ 10/Mar/15 ]

dan@10gen.com,
Are the "other parts of the system that can error" (e.g., $push - cf. Asya's comment) also impacted by this new behaviour or are we only dealing here with unwind?
Thanks anyway for increasing the severity and considering a quick fix.

– Laurent

Comment by Daniel Pasette (Inactive) [ 10/Mar/15 ]

A null field, an empty array, and a missing field all currently emit no document, which is correct.
We will change aggregation to emit a single document containing the scalar value, as requested.
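
The behaviour described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the server implementation: `unwindModel` and the sample array `reg` are hypothetical names, and the sample documents are the ones from the description that match /ba/i.

```javascript
// Hypothetical in-memory model of the $unwind behaviour described above.
// This is not MongoDB code; unwindModel is an illustrative name.
function unwindModel(docs, field) {
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined || value === null) {
      continue; // missing or null field: emit no document
    }
    if (Array.isArray(value)) {
      // Array: one output document per element (an empty array emits none).
      for (const element of value) {
        out.push({ ...doc, [field]: element });
      }
    } else {
      // Scalar: emit a single document containing the scalar instead of erroring.
      out.push({ ...doc });
    }
  }
  return out;
}

// The documents from the description that match /ba/i.
const reg = [
  { _id: 2, text: "bar" },
  { _id: 3, text: "Bar" },
  { _id: 4, text: ["bar", "foo"] },
];

console.log(unwindModel(reg, "text"));
```

In this model the two scalar documents pass through unchanged and the array document yields one document per element, so four documents come out in total.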

Comment by Asya Kamsky [ 01/Aug/14 ]

ldolle@amadeus.com

One possible workaround would be to catch the exception raised when $unwind is applied to a non-array and then rewrite the query programmatically based on the error.

I do want to point out that aggregation is not the only part of the system that can error when a field holds an array in some documents of a collection and a non-array in others: any array operation, such as "$push", will also give an error if the targeted field is not of the expected type, so this is not something introduced by the aggregation framework.

A possible upstream workaround would be to normalize all fields that are allowed to be arrays so that they are always arrays, but since I don't know where the data comes from I cannot say whether that's feasible.
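
A minimal sketch of the upstream normalization idea, applied on the application side before a document is written. The function name `normalizeArrayFields` and the list of array-capable field names are assumptions made for illustration; leaving null and missing fields untouched is also an assumption, not something the comment specifies.

```javascript
// Hypothetical helper: wrap scalar values of array-capable fields in a
// one-element array so every stored document uses the same shape.
function normalizeArrayFields(doc, arrayFields) {
  const normalized = { ...doc }; // shallow copy; the input is not modified
  for (const field of arrayFields) {
    const value = normalized[field];
    if (value === undefined || value === null) {
      continue; // assumption: missing and null fields are left as they are
    }
    if (!Array.isArray(value)) {
      normalized[field] = [value];
    }
  }
  return normalized;
}

console.log(normalizeArrayFields({ _id: 1, text: "foo" }, ["text"]));
console.log(normalizeArrayFields({ _id: 4, text: ["bar", "foo"] }, ["text"]));
```

The first call wraps the scalar in an array; the second returns a document whose array is unchanged. Whether this is practical depends on controlling every writer of the collection, which is the caveat raised above.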

Asya

Comment by Scott Frenkiel [ 30/Jul/14 ]

Small world. I enjoyed the talk also. I had never heard of mtools, but it's been helpful for us.

We'll look into this workaround.

Comment by Laurent Dollé [ 30/Jul/14 ]

Yes, a workaround on the application side is definitely identified. The problem is the implementation, which would be even uglier, as we have a lot of "optional" arrays, which could themselves contain other optional arrays, and so on.
As our application is a query generator that can handle any unpredictable user input (read: it can $unwind pretty much any array), we would end up with a complex workaround that needs the schema to be hard-coded just to recreate the missing optional array, and its missing optional parent array, and so on.

Definitely possible, but with a lot of boilerplate. Thanks anyway for the tip; we may give it a try if this ticket does not move much.

Do you know if investigating the server code and trying to implement that toggle on our side would be something feasible?


Totally unrelated, but I have to take the opportunity: Asya, I really liked your "Sherlock Holmes"/investigation talk at MongoDB World last month.
Nice work... and entertaining too!

Comment by Asya Kamsky [ 29/Jul/14 ]

There is a work-around (ugly one) for determining whether a field is an array or not in the pipeline - you can add a $project which will turn the field into an array if that field is not an array.

I describe a related hack to determine type here: http://www.kamsky.org/stupid-tricks-with-mongodb/is-it-an-array

Instead of using it for $size, if it's not an array you can use an almost equally ugly trick (which will be slower, but at least it's something that can functionally work until the feature is implemented):

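// Assume "a" is the field that may or may not contain an array.
db.collection.aggregate([
  // Flag array-valued fields: the trick is that "$a.0" compares equal to [] when "a" is an array.
  {$project: {a: 1, isArray: {$cond: {if: {$eq: ["$a.0", []]}, then: true, else: false}}}},
  // Regroup by _id so that $addToSet wraps the value of "a" in an array "aa".
  {$group: {_id: "$_id", a: {$first: "$a"}, aa: {$addToSet: "$a"}, isArray: {$first: "$isArray"}}},
  // Keep "a" if it was already an array; otherwise use the wrapped copy.
  {$project: {a: {$cond: {if: "$isArray", then: "$a", else: "$aa"}}}}
])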

Now it's safe to unwind. If you had any documents that had no field a then it'll be an empty array here and that will lose the document during unwind - there is a trick to rectify that as well by testing if field is equal to [ ] and then putting [ null ] or some such in there (which can be done in the else of the second $project above).

Comment by Laurent Dollé [ 24/Jul/14 ]

Any update on Jon's comment (25Apr14)?
If discarding documents when the unwound field is empty or missing is purely an implementation choice, we'd like to be able to switch this behaviour off.

Comment by Jon Rangel (Inactive) [ 25/Apr/14 ]

Can the scope of this ticket be extended to include support for passing through documents that contain an empty array or non-existent field? There would need to be an option to toggle the behaviour of $unwind.

There are use cases involving $unwind followed by $group on _id where you want to aggregate array elements but don't want to filter docs from the output simply because they didn't contain any array elements.
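
The requested toggle can be illustrated with an in-memory model. Everything here is hypothetical: `unwindWithToggle` and its `preserveEmpty` flag are names invented for this sketch and are not server options, and treating a null field the same as an empty array or a missing field is an assumption.

```javascript
// Hypothetical model of $unwind with a toggle for empty or missing fields.
function unwindWithToggle(docs, field, options = {}) {
  const preserveEmpty = options.preserveEmpty === true; // hypothetical flag
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    const isEmpty =
      value === undefined ||
      value === null ||
      (Array.isArray(value) && value.length === 0);
    if (isEmpty) {
      if (preserveEmpty) {
        out.push({ ...doc }); // pass the document through unchanged
      }
      continue;
    }
    // Treat a scalar as a one-element array so it yields a single document.
    const elements = Array.isArray(value) ? value : [value];
    for (const element of elements) {
      out.push({ ...doc, [field]: element });
    }
  }
  return out;
}

const posts = [
  { _id: 1, tags: ["a", "b"] },
  { _id: 2, tags: [] },
  { _id: 3 },
];

// Default: documents 2 and 3 are dropped, so only _id 1 appears (twice).
console.log(unwindWithToggle(posts, "tags").map((d) => d._id));
// With the toggle on, documents 2 and 3 survive for a later group on _id.
console.log(unwindWithToggle(posts, "tags", { preserveEmpty: true }).map((d) => d._id));
```

With the flag off, a following $group on _id would never see documents 2 and 3; with it on, they remain available to the grouping stage.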

Comment by Scott Frenkiel [ 09/Dec/13 ]

In my situation, I allow my client apps to extend my stored docs with their own properties, but because of this behavior I am forced to make them tell me at query time whether the property is an array or not. So this would definitely help me simplify my interface.

Comment by Scott Hernandez (Inactive) [ 25/Oct/13 ]

The $push behavior is a different case: one is about reading/reporting, while the other is about writing. During the transition from a single element to an array, it would be nice for this to work on the reporting side, using aggregation; it also matches how queries/indexes work conceptually.

Comment by Mathias Stearn [ 07/Oct/13 ]

I think current behaviour is correct. This mimics $push which errors on non-missing/non-array fields.

Comment by Eduardo Manso [ 24/Feb/13 ]

I agree with you ... I've also tried to do it that way and got the same error.
