-
Type:
New Feature
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Query Optimization
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Inspired by some similar capability in ElasticSearch: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html#run-sub-aggs
Here's an example. Data set:
{_id: 1, party: "dem", state: "NY", metric: 10}
{_id: 2, party: "dem", state: "NY", metric: 14}
{_id: 3, party: "rep", state: "PA", metric: 16}
{_id: 4, party: "dem", state: "PA", metric: 12}
{_id: 5, party: "rep", state: "NY", metric: 8}
{_id: 6, party: "dem", state: "PA", metric: 9}
I could run this aggregation:
{
$group: {
_id: "$state",
stateTotalMetric: {$sum: "$metric"},
count: {$sum: 1},
byParty: {
$group: {
_id: "$party",
count: {$sum: 1},
totalMetric: {$sum: "$metric"}
}
}
}
}
And I would get this result:
{
_id: "NY",
stateTotalMetric: 32,
count: 3,
byParty: [
{_id: "dem", count: 2, totalMetric: 24},
{_id: "rep", count: 1, totalMetric: 8}
]
}
{
_id: "PA",
stateTotalMetric: 37,
count: 3,
byParty: [
{_id: "dem", count: 2, totalMetric: 21},
{_id: "rep", count: 1, totalMetric: 16}
]
}
This computation is obviously expressible today, but here's how you would have to do it:
{
$group: {
_id: {state: "$state", party: "$party},
count: {$sum: 1},
totalMetric: {$sum: "$metric"}
}
},
{
$group: {
_id: "$_id.state",
stateTotalMetric: {$sum: "$totalMetric"},
count: {$sum: "$count"},
byParty: {
$push: {
_id: "$_id.party",
count: "$count",
totalMetric: "$totalMetric"
}
}
}
}
Obviously, this is longer and more cumbersome. It also gets worse and more error prone if I had used $avg instead of $sum, where it would be incorrect or at least very different to take an average of averages, and I would have to track counts and sums independently in the first $group.
- is related to
-
SERVER-66701 Optimize $addToSet accumulators into $group stages where possible
-
- Open
-
-
SERVER-29930 allow nested $facet in another $facet of aggregate
-
- Backlog
-
- related to
-
SERVER-62405 Partially-streaming compound $sort
-
- Backlog
-