[SERVER-6857] Add Aggregation Framework expression to average dates Created: 26/Aug/12  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 2.2.0-rc0, 2.2.0-rc1
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Joe Davis Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 2
Labels: expression, grab-bag, usability
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux, Mac OSX


Assigned Teams:
Query Execution
Participants:

 Description   

Add an accumulator/expression that takes dates and returns their average (could be called $avgDate).

Original description:

When using aggregation framework $avg on an ISODate in mongo shell I get the following error message:

{
	"errmsg" : "exception: can't convert from BSON type 2 to double",
	"code" : 16005,
	"ok" : 0
}

Here's my setup. Collection data:

> db.sys_user.find({}, {sys_created_on:1}).limit(3)
{ "_id" : ObjectId("502a95d4c2e6f4d8ffbbef21"), "sys_created_on" : ISODate("2012-02-18T03:04:49Z") }
{ "_id" : ObjectId("502a95d4c2e6f4d8ffbbef22"), "sys_created_on" : ISODate("2012-02-18T03:04:49Z") }
{ "_id" : ObjectId("502a95d4c2e6f4d8ffbbef23"), "sys_created_on" : ISODate("2012-02-18T03:04:49Z") }

This errors:

> db.sys_user.aggregate([ { $group: {_id:null, max:{$max:"$sys_created_on"}, min:{$min:"$sys_created_on"}, avg:{$avg:"sys_created_on"}} }, { "$project" : { "_id" : 0 , "max": 1, "min": 1, "avg": 1}} ])
{
	"errmsg" : "exception: can't convert from BSON type 2 to double",
	"code" : 16005,
	"ok" : 0
}

$min, $max work fine:

> db.sys_user.aggregate([ { $group: {_id:null, max:{$max:"$sys_created_on"}, min:{$min:"$sys_created_on"}} }, { "$project" : { "_id" : 0 , "max": 1, "min": 1}} ])
{
	"result" : [
		{
			"max" : ISODate("2012-03-21T05:13:21Z"),
			"min" : ISODate("2004-05-02T00:00:00Z")
		}
	],
	"ok" : 1
}



 Comments   
Comment by Charlie Swanson [ 10/Apr/18 ]

Oh I see now. Can we instead convert this into a request for a new accumulator/expression called $avgDate or something like that? I don't see us implementing it any time soon, but it does seem like a valid improvement request.

Comment by Asya Kamsky [ 10/Apr/18 ]

It would be backwards breaking because currently taking average of number X and any number of dates returns result X. Once we start allowing averages of dates, we would have to decide what to do in mixed type accumulation, and result we would return to this operation would likely change.

Comment by Charlie Swanson [ 09/Apr/18 ]

Can you elaborate on why making the proposed change would be backwards breaking? Just because it returns null today and would return something non-null after this? While I agree there is technically a way to do this, the user's experience would be pretty sad. It would pretty much require a google search to figure out how to do it.

This still seems like a worthwhile improvement, and if a user wants to generate null after grouping some dates together, there are plenty of other ways to do that.

Comment by Asya Kamsky [ 07/Apr/18 ]

Here is result with workaround provided in previous comment:

{ "max" : ISODate("2012-02-18T03:04:49Z"), "min" : ISODate("2004-05-02T00:00:00Z"), "avg" : ISODate("2010-03-08T02:18:36.750Z") }

Here is the same result on 3.7.4 with new type conversion functionality:

db.sys_user.aggregate([ { $group: {_id:null, max:{$max:"$sys_created_on"}, min:{$min:"$sys_created_on"}, avg:{$avg:{$toLong:"$sys_created_on"}}} }, { "$project" : { "_id" : 0 , "max": 1, "min": 1, "avg": {$toDate:"$avg"}}} ])
{ "max" : ISODate("2012-02-18T03:04:49Z"), "min" : ISODate("2004-05-02T00:00:00Z"), "avg" : ISODate("2010-03-08T02:18:36.750Z") }

Given making the proposed change would be somewhat backwards breaking, I propose closing this ticket as won't fix since there are existing ways to achieve the same result.

david.storch charlie.swanson any objections?

Comment by Asya Kamsky [ 19/Dec/17 ]

Since we have allowed date arithmetic for a long time, it's been possible to do this calculation in aggregation like this:

let dt=new Date();   /* any date will do */
db.sys_user.aggregate([
     {$group:{
        _id:null, 
          max:{$max:"$sys_created_on"}, 
          min:{$min:"$sys_created_on"}, 
          avg:{$avg:{$subtract:[dt, "$sys_created_on"]}}
      }}, 
      { "$project" : { "_id" : 0 , "max": 1, "min": 1, "avg": {$subtract:[dt, "$avg"]}}} 
])

Comment by Asya Kamsky [ 19/Dec/17 ]

Just tested on 3.6.0 - $avg does not give an error on ISODate type but rather returns null in this example.

Comment by Joe Davis [ 29/Aug/12 ]

Yes, arithmetic mean is what I was expecting. Use case is an adhoc reporting capability that allows users/developers to choose aggregation functions (eg MIN, MAX, AVG, COUNT) and apply them to any field.

MySQL allows SELECT AVG(datefield) FROM testTable, it's an arithmetic mean: given dates [2012,2005] AVG() would return June 2008.

Alternately, if we could convert dates to/from numbers we could SUM(), COUNT(), and re-cast to a date on our own.

Comment by Mathias Stearn [ 27/Aug/12 ]

Before implementing this, I want to be sure that simple arithmetic mean is the correct operation for dates. Could you give me your use case and what you hope to discover using that query?

Generated at Thu Feb 08 03:12:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.