[SERVER-7805] Unable to access _id subdocuments in $group Created: 29/Nov/12  Updated: 04/Feb/15  Resolved: 30/Nov/12

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 2.2.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Christian Csar Assignee: Mathias Stearn
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

amd64 Amazon Linux AMI 2012.09 with amd64 mongo installed from 10gen yum repository.


Issue Links:
Depends
Duplicate
duplicates SERVER-7491 Can't use subfields of composite _id ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Create a collection that has a compound _id field, for example {_id: {a: 1, b: 2}, c: 3}, then run an aggregation with a $group whose _id references _id.a, for example {"$group" : {_id : "$_id.a", total: {"$sum" : "$c"}}}. In this case _id ends up as null because _id.a is ignored.
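To make the expected result concrete, the following is a minimal plain-JavaScript model of how a dotted field path such as "$_id.a" would be expected to resolve inside $group. It is not MongoDB code and does not reflect the server's implementation; resolvePath and groupSum are hypothetical helpers written only for this illustration.

```javascript
// Plain-JavaScript model of the expected $group behaviour (not MongoDB code).
// resolvePath and groupSum are hypothetical helpers, not part of any MongoDB API.

// Resolve a field path such as "$_id.a" against a document.
function resolvePath(doc, path) {
  const parts = path.replace(/^\$/, "").split(".");
  let value = doc;
  for (const part of parts) {
    if (value === null || typeof value !== "object" || !(part in value)) {
      return null; // in this model a missing field groups under null
    }
    value = value[part];
  }
  return value;
}

// Group by idPath and sum sumPath, modelling
// {$group: {_id: idPath, total: {$sum: sumPath}}}.
function groupSum(docs, idPath, sumPath) {
  const totals = new Map();
  for (const doc of docs) {
    const key = JSON.stringify(resolvePath(doc, idPath));
    const amount = resolvePath(doc, sumPath);
    const increment = typeof amount === "number" ? amount : 0;
    totals.set(key, (totals.get(key) || 0) + increment);
  }
  return Array.from(totals, ([key, total]) => ({ _id: JSON.parse(key), total }));
}

const docs = [{ _id: { a: 1, b: 2 }, c: 3 }];
const expected = groupSum(docs, "$_id.a", "$c");
console.log(JSON.stringify(expected)); // [{"_id":1,"total":3}]
```

On an affected server the same grouping reportedly yields an _id of null instead of 1, which is the symptom described in these steps.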

Participants:

 Description   

It appears that when the Aggregation Framework is used on a collection whose _id is an object with subfields, $group ignores any subfield of _id when specifying its own _id. The main case where this occurs is running the aggregation framework on a collection that is the output of a map reduce. It is possible to work around this by using a $project stage to rename the fields, for instance from _id.Z to Z. However, this presumably takes extra resources. There is also no error message, which is likely to frustrate the developer or user.
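As a rough sketch of the workaround, the stages can be assembled as plain objects before they are handed to aggregate. The helper name buildFlattenedGroup and its parameters are hypothetical and exist only for this illustration; the field names are taken from the example in this report.

```javascript
// Hypothetical helper: builds the $match + $project + $group workaround as plain objects.
// renames maps a new top-level name to the original dotted path, e.g. { Z: "_id.Z" }.
function buildFlattenedGroup(match, renames, groupKeys, sumField) {
  const project = {};
  for (const [name, path] of Object.entries(renames)) {
    project[name] = "$" + path; // copy the nested field to a top-level field
  }
  const groupId = {};
  for (const key of groupKeys) {
    groupId[key] = "$" + key; // group only on the renamed top-level fields
  }
  return [
    { $match: match },
    { $project: project },
    { $group: { _id: groupId, count: { $sum: "$" + sumField } } },
  ];
}

const pipeline = buildFlattenedGroup(
  { "value.U": 158, "_id.t": "I" },
  { Z: "_id.Z", U: "value.U", I: "value.I" },
  ["Z", "U"],
  "I"
);
console.log(JSON.stringify(pipeline, null, 2));
// In the mongo shell the stages could then be passed on, for example as
// db.hourAggregate.aggregate(pipeline[0], pipeline[1], pipeline[2])
```

Because $group then only sees top-level fields, it never has to resolve a path under _id, which is the part that appears to be ignored.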

For example, the following two aggregations should in principle be identical, with the $project stage in the second being superfluous. The projection of value.I to I and value.U to U appears to be unnecessary, as the problem seems specific to _id.
mongos> db.hourAggregate.aggregate( { $match : { "value.U" : 158, "_id.t" :"I"}} , {$group : {_id : {Z: "$_id.Z", U : "$value.U" }, count : {$sum: "$value.I"} }})
{
        "result" : [
                {
                        "_id" : { "U" : 158 },
                        "count" : NumberLong(1615478)
                }
        ],
        "ok" : 1
}
mongos> db.hourAggregate.aggregate( { $match : { "value.U" : 158, "_id.t" :"I"}} , {$project : { Z:"$_id.Z", U:"$value.U", I : "$value.I"}} ,{$group : {_id : {Z: "$Z", U : "$U" }, count : {$sum: "$I"} }})
{
        "result" : [
                {
                        "_id" : { "Z" : 137, "U" : 158 },
                        "count" : NumberLong(541555)
                },
                {
                        "_id" : { "Z" : 138, "U" : 158 },
                        "count" : NumberLong(470692)
                },
                {
                        "_id" : { "Z" : 139, "U" : 158 },
                        "count" : NumberLong(603231)
                }
        ],
        "ok" : 1
}

See https://groups.google.com/d/topic/mongodb-user/Yw3fvn7udY4/discussion
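The two result sets above are consistent with Z simply being dropped from the group key: the three per-Z counts from the $project variant add up to the single count returned by the direct $group. A quick arithmetic check, using only the numbers shown in this report (the variable names are illustrative):

```javascript
// Counts copied from the two result sets in this report.
const collapsedCount = 1615478; // single group from the $group that references $_id.Z directly
const perZCounts = { 137: 541555, 138: 470692, 139: 603231 }; // groups from the $project variant

// Sum the per-Z counts and compare with the collapsed count.
const summed = Object.values(perZCounts).reduce((acc, n) => acc + n, 0);
console.log(summed, summed === collapsedCount); // 1615478 true
```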



 Comments   
Comment by Mathias Stearn [ 30/Nov/12 ]

This has already been fixed in 2.2.2 and 2.3.1.

Comment by Mathias Stearn [ 30/Nov/12 ]

I can confirm that this fails in 2.2.0, but it is already working correctly in git master for me:

2.2.0:

> db.stuff.insert({_id: {a: 1, b:2}, c :3})
> db.stuff.aggregate( {"$group" : {_id : "$_id.a" , total: {"$sum" : "$c" } }} )

My latest build:

> db.stuff.aggregate( {"$group" : {_id : "$_id.a" , total: {"$sum" : "$c" } }} )
{ "result" : [ { "_id" : 1, "total" : 3 } ], "ok" : 1 }
> db.stuff.aggregate( {"$group" : {_id : {a:"$_id.a"} , total: {"$sum" : "$c" } }} )
{
        "result" : [
                {
                        "_id" : {
                                "a" : 1
                        },
                        "total" : 3
                }
        ],
        "ok" : 1
}

I'm not sure which of the many changes I've made for the 2.3 series fixed this bug. I'm going to try to see if this is easy to fix for 2.2.x, but that may not be possible.

Generated at Thu Feb 08 03:15:39 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.