[SERVER-4437] aggregation: support windowing operation on pipelines Created: 06/Dec/11  Updated: 06/Dec/22  Resolved: 01/Jun/21

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Daniel Pasette (Inactive) Assignee: Backlog - Query Optimization
Resolution: Done Votes: 10
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-52328 Enable feature flag for Window Functions Closed
Related
related to SERVER-29161 Ability to access previous document i... Closed
related to SERVER-29339 allow using $reduce expression as acc... Backlog
is related to SERVER-447 new aggregation framework Closed
Assigned Teams:
Query Optimization
Participants:

 Description   

PostgreSQL supports a windowing capability that allows calculations within a window of visible data; this is for streaming data.

It's easy to imagine supporting something like a $window pipeline operator that specifies how many documents to include in the window. Within the window, aggregate expressions could reference its documents using some kind of indexing. This could be used to calculate things like moving averages: for example, with a window of 5 items, create a computed field that is (doc[0] + doc[-1] + doc[-2] + doc[-3] + doc[-4])/5, or something like that.
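
To make the moving-average idea concrete, here is a minimal Python sketch of the semantics described above. It is not MongoDB syntax and not a proposed implementation; the function name `moving_average`, the output field `moving_avg`, and the choice to emit `None` until the window is full are all assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, Iterator


def moving_average(docs: Iterable[dict], field: str, size: int = 5) -> Iterator[dict]:
    """Yield each document with a hypothetical 'moving_avg' field added.

    The window holds the current document's value (doc[0]) and the values
    of the size - 1 documents before it (doc[-1] ... doc[-(size - 1)]).
    """
    window = deque(maxlen=size)  # the oldest value is dropped automatically
    for doc in docs:
        window.append(doc[field])
        out = dict(doc)  # copy so the input document is left untouched
        # Assumption: no average is reported until the window is full.
        out["moving_avg"] = sum(window) / size if len(window) == size else None
        yield out
```

Each input document is passed through rather than collapsed, which is what distinguishes a window from a $group-style aggregation.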



 Comments   
Comment by Asya Kamsky [ 06/Sep/17 ]

More current links to description of group by windowing/rollup/cube functions:

Comment by Chris Westin [ 14/Dec/11 ]

References from a postgres user:

As mentioned, SQL window functions might be a useful design from which to draw further feature inspiration, as they are likely compatible with your "pipeline" approach. I am using these on PostgreSQL and find the best documentation there. There were [PGCon] presentation slides from 2009 giving a feature introduction and some implementation details, as well as the standard PostgreSQL docs: [PG0] intro and [PG1] functions.

The feature allows a PARTITION to be defined with similar options to GROUP BY, but then rather than collapsing to a single row for each group, preserves the original set of rows and allows access to functions of the PARTITION. For example, I make heavy use of the rank() function for a server side scoring algorithm.

[PGCon]: http://www.pgcon.org/2009/schedule/events/128.en.html
[PG0]: http://www.postgresql.org/docs/9.1/static/tutorial-window.html
[PG1]: http://www.postgresql.org/docs/9.1/static/functions-window.html
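
The PARTITION behaviour described in the comment above (group rows as GROUP BY would, but keep every row and attach a function of the partition) can be sketched in Python as follows. This is an illustration of the semantics only, under the assumption that ranking follows SQL rank() rules (ties share a rank and leave a gap after them); the function name `rank_within_partition` and its parameters are hypothetical.

```python
from collections import defaultdict


def rank_within_partition(rows, partition_key, order_key, descending=True):
    """Return copies of rows, in their original order, with a 'rank' field.

    Rows are grouped by partition_key and ranked by order_key within each
    group. No rows are collapsed, unlike a GROUP BY.
    """
    # Collect the positions of the rows that belong to each partition.
    partitions = defaultdict(list)
    for index, row in enumerate(rows):
        partitions[row[partition_key]].append(index)

    ranks = [0] * len(rows)
    for indices in partitions.values():
        ordered = sorted(indices, key=lambda i: rows[i][order_key], reverse=descending)
        current_rank = 0
        previous_value = None
        for position, i in enumerate(ordered, start=1):
            value = rows[i][order_key]
            # A new rank starts only when the ordering value changes,
            # so tied rows share a rank and the next rank skips ahead.
            if position == 1 or value != previous_value:
                current_rank = position
            ranks[i] = current_rank
            previous_value = value

    return [{**row, "rank": rank} for row, rank in zip(rows, ranks)]
```

The output has exactly as many rows as the input, which is the property the comment highlights.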
