[SERVER-30270] $sum should be null when no fields/non numeric are found Created: 21/Jul/17  Updated: 06/Dec/22  Resolved: 11/Aug/17

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Ben Rotz Assignee: Backlog - Query Team (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-14691 $avg aggregation operator should retu... Closed
Assigned Teams:
Query
Participants:

 Description   

I am the original reporter that led to https://jira.mongodb.org/browse/DOCS-8996

I believe that $sum should work this way as well.

For example:

db.some_collection.aggregate([{$group: {
_id: null,
average: {$avg: '$bad_field'},
sum: {$sum: '$bad_field'},
}}]);

will return null for average, and 0 for sum. It should return null for sum.

I'm a little surprised $sum was not updated when $avg was, and feel that this was an oversight.

Thanks

Ben



 Comments   
Comment by Ben Rotz [ 24/Jul/17 ]

Hm I could see what you are saying is true. I suppose a somewhat related operation in javascript is

var test = null + null;

or even

var test = +null;

results in 0 due to + casting

OK. Thanks for the explanation. It's obviously easier for my use case to just have the 'sum' value be null in this case without having to read another field to determine that, but I can understand that semantics are very important so as to not wind up with edge cases all over.

Comment by Charlie Swanson [ 24/Jul/17 ]

I think this was an intentional decision, since the mathematical definition of the sum of 0 things is 0. There is no definition of the average of 0 things, so we had to go with null there. If you want to know if it was 0 because it totaled to 0 or if it was because nothing was non-null, you could add an additional field to the output like so:

db.foo.aggregate([
{$group: {
  _id: "$groupIdField",
  sum: {$sum: "$value"},
  nonNullCount: {$sum: {$cond: {
    if: {$in: [{$type: "$value"}, ["null", "missing", "undefined"]]},
    then: 0,
    else: 1
  }}
}}
])

Comment by Ben Rotz [ 22/Jul/17 ]

The use case here is the same as the $avg use case, being able to tell differences between actual found values of 0 vs null or absent fields. In turn, these values could also be used in averages, and so the reasoning for changing $avg to be null for nullish fields also applies here as well.

I understand and agree about the minor backwards breaking change. I think there are simple work arounds for this desired behavior:

either using $ifNull in a project to 0 (e.g. 'value' :

{ $ifNull: ['$value', 0] }

) , or adding an actual literal 0 to project $sums (e.g. 'total':

{ $sum: ['$bad_field', 0] }

)

Thank you for your time and consideration.

Comment by Asya Kamsky [ 22/Jul/17 ]

Changing $sum of nullish values to be null rather than 0 would be a minor backwards breaking change. We should consider how the current behavior could be considered incorrect (for instance, not being able to tell apart sum that's actually 0 as opposed to all null or missing is potentially an issue).

The query team will triage this in the next planning meeting.

Generated at Thu Feb 08 04:23:13 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.