[SERVER-16284] Allow for $pushall / $push with $each to allow pushing all elements of array in $group Created: 22/Nov/14  Updated: 31/Oct/23

Status: Open
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: John Butler Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 20
Labels: accumulator, aggregate, asya, expression, needs-scope-approval, needs-syntax, pm1457-nominee
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-23839 Aggregation accumulators does not all... Closed
is duplicated by SERVER-14340 Option for $addToSet to work on array... Closed
is duplicated by SERVER-27484 allow $setUnion in $group Closed
Gantt Dependency
Related
related to SERVER-29339 allow using $reduce expression as acc... Backlog
Assigned Teams:
Query Optimization
Participants:

 Description   

I would like to be able to push all elements of an array field into a new field. So, in a $group, I could do a $push or $addToSet of a field that is an array field and the resulting field would be an array of the elements instead of an array of arrays.

Seems like the $each operator in update is the solution if added to aggregations.



 Comments   
Comment by Asya Kamsky [ 11/Aug/21 ]

Ran into this again here: https://www.mongodb.com/community/forums/t/group-multi-dimensional-array-after-unwinding-elements/117554/3 where the solution required flattening arrays of arrays because there was no way to $pushAll.

 

Comment by Asya Kamsky [ 26/Jul/17 ]

SERVER-29339 would allow this, and any other arbitrary grouping but I would agree that the more common use cases should have simplest possible syntax.

Comment by Marc Tinkler [ 26/Jul/17 ]

Just my 2 cents, but if you are going to be adding accumulators like $mergeObjects (SERVER-24879) you really should consider adding support for set union when grouping.

Comment by David Storch [ 23/Jan/17 ]

As noted in SERVER-27484, allowing a set union operation in $group would be useful, in addition to array concatenation. Both are in scope for this ticket. In other words, given the input documents for a group

{_id: "myGroup", arr: [1, 2, 3]}
{_id: "myGroup", arr: [2, 3, 4]}

there should be an easy way to reduce this to either {_id: "myGroup", resultArr: [1, 2, 3, 4]} or {_id: "myGroup", resultArr: [1, 2, 3, 2, 3, 4]}.

Comment by Neil Lunn [ 24/Apr/16 ]

I would actually see this more in this form:

    db.foo.drop()
    db.foo.insertMany([
        { "x": 1, "a": [1,2] },
        { "x": 1, "a": [2,3] }
    ])

Where accumulators functioned something like the following and without needing to resort to $unwind

 
    db.foo.aggregate([
       { "$group": {
           "_id": "$x",
          "a": { "$pushEach": "$a" }
      }}
    ])
   // Results in 
   { "_id": 1, "a": [1,2,2,3] }
 
    db.foo.aggregate([
       { "$group": {
           "_id": "$x",
          "a": { "$addEach": "$a" }
      }}
    ])
   // Results in 
   { "_id": 1, "a": [1,2,3] }

Or even as an allowed "modifier" from a syntax point of view

 
    db.foo.aggregate([
       { "$group": {
           "_id": "$x",
          "a": { "$addToSet": { "$each": "$a" } }
      }}
    ])
   // Results in 
   { "_id": 1, "a": [1,2,3] }
 
   // Or with expressions
   db.foo.drop()
   db.foo.insert({ "a": 1, "b": 2 })
 
    db.foo.aggregate([
       { "$group": {
         "_id": null,
         "data": {
           "$push": {
             "$each": {
                 "$setUnion": [
                   [{ "a": "$a", "b"; "$b" }],
                   [{ "a": "$b", "b": "$a" }]
                 ]
             }
           }
         }
       }}
    ])
   // Results in
   { "_id": null, "data": [{ "a": 2, "b": 1 }, { "a": 1, "b": 2 }] }

Point being that there should be an operator syntax for applying each element of an existing array or "set" without needing to resort to $unwind.

The $unwind operation should be reserved for cases where it is necessary to actually transfer elements from those arrays into part of the grouping key itself.

Allowing such modifiers and even $sort as well removes various use cases requiring $unwind and should significantly improve performance where the array/"set" can be merged/sorted within the accumulator result.

Comment by Charlie Swanson [ 05/Nov/15 ]

I should add to my example that $slice is new in 3.2, so this won't work against a mongod <=3.0, but the point remains, any expression can be used in the group stage, just substitute $slice for your favorite projection operator.

Comment by Charlie Swanson [ 04/Nov/15 ]

Hi oeberle@cellmobs.com,

That ticket is slightly misnamed. You can use the $slice operator in a $group stage, just as you can use any other operator available in the project stage. e.g. given the following documents:

{_id: 0, comments: [{text: 'hello there'}, {text: 'hi!'}]}
{_id: 1, comments: [{text: 'hello again!'}, {text: 'yo'}]}

You can run the following pipeline

db.foo.aggregate([{
  $group: {
    _id: null,
    comments: {
      $push: {
        $slice: ['$comments', 1]
      }
    }
  }
}])

To get this result

{_id: null, comments: [{text: 'hello there'}, {text: 'hello again!'}]}

Comment by Oliver Eberle [ 28/Jul/15 ]

It would be nice if $push in aggregation could support $each and @slice etc.

Having the option to limit array items with slice in the $project stage is now supported (SERVER-6074) but it would be good to have these options as part of the @group stage as well.

Comment by John Butler [ 26/Nov/14 ]

So yes, I am familiar with unwind but here is how it ended up. In my case I would like to do a two-step group. Group 1 produces a count and creates new arrays of values. Group 2 gives me an overall count and concatenates the two arrays (via the $pushall or $push / $each to produce an array of values instead of an array or arrays).

Instead of doing the aggregation as two simple groups, I have to do the following:

group1
unwind array1
unwind array2
group2 (this group is limited in that I cannot get a count of unique values because of the double unwind)
project to get a count that could have been done in group2 if I had not had to unwind twice

So yes, I was able to do a workaround. The point however is that this seems to be a functionality ($each I believe) that would not be uncommonly used if available in aggregate and seems to be available for update. I ask that you make it available in aggregate as well.

Comment by Ramon Fernandez Marina [ 25/Nov/14 ]

You may want to try using the $unwind operator to achieve what you need; for example:

> db.foo.drop()
> db.foo.insert({x:1, a:[1,2]})
> db.foo.insert({x:1, a:[3,4]})
// $push creates an array of arrays
> db.foo.aggregate([{$group: {_id:'$x', res:{$push:"$a"}}}])
{ "_id" : 1, "res" : [ [ 1, 2 ], [ 3, 4 ] ] }
// using $unwind before $group does the trick
> db.foo.aggregate([{$unwind:'$a'}, {$group: {_id:'$x', res:{$push:"$a"}}}])
{ "_id" : 1, "res" : [ 1, 2, 3, 4 ] }

Is this the behavior you're looking for?

Generated at Thu Feb 08 03:40:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.