[SERVER-7463] $percentile aggregation accumulator Created: 24/Oct/12 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Andre de Frere | Assignee: | Backlog - Query Execution |
| Resolution: | Unresolved | Votes: | 40 |
| Labels: | accumulator, expression, pull-request |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Assigned Teams: | Query Execution |
| Description |
| Comments |
| Comment by Anh-tuan Gai [ 03/May/16 ] |
|
+1 for a TDigest implementation to build a $percentile aggregation accumulator. Moreover, to support incremental aggregation and merging of several TDigests (e.g. to compute a percentile over a long-running period by aggregating multiple TDigests for sub-periods), a TDigest serializer/deserializer would be great, so that a TDigest can be stored as Binary Data. |
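As a rough illustration of the serialize-and-merge idea in the comment above, the sketch below round-trips a list of (mean, weight) centroids through a byte string that could be stored in a Binary Data field. The wire format, the function names, and the centroid representation are all assumptions made for this example; they are not taken from the t-digest library or from MongoDB. The `merge` helper only concatenates and sorts; a real digest would also re-compress the result to keep it bounded.

```python
import struct

# Hypothetical wire format (an assumption for illustration only):
# a 4-byte big-endian centroid count, then one (float64 mean, uint64 weight)
# pair per centroid, also big-endian.
_HEADER = struct.Struct(">I")
_PAIR = struct.Struct(">dQ")


def serialize(centroids):
    """Pack a list of (mean, weight) centroids into bytes."""
    parts = [_HEADER.pack(len(centroids))]
    parts.extend(_PAIR.pack(mean, weight) for mean, weight in centroids)
    return b"".join(parts)


def deserialize(blob):
    """Unpack bytes produced by serialize() back into (mean, weight) tuples."""
    (count,) = _HEADER.unpack_from(blob, 0)
    expected = _HEADER.size + count * _PAIR.size
    if len(blob) != expected:
        raise ValueError("truncated or oversized sketch blob")
    return [
        _PAIR.unpack_from(blob, _HEADER.size + i * _PAIR.size)
        for i in range(count)
    ]


def merge(a, b):
    """Combine two centroid lists; the result is sorted by mean.

    No re-compression is done here, so the result grows with its inputs.
    """
    return sorted(list(a) + list(b))


# Usage: store one blob per sub-period, then merge them at query time.
hour1 = [(10.0, 3), (20.0, 5)]
hour2 = [(15.0, 2)]
combined = merge(deserialize(serialize(hour1)), hour2)
```

Storing one such blob per sub-period would let a later query merge the stored summaries instead of rescanning the raw documents.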
| Comment by Asya Kamsky [ 09/Feb/16 ] |
|
real_ate, I don't believe we have. Feel free to add your ideas here. To make sure we are on the same page, it would help to include some specific examples. |
| Comment by Chris Manson [ 09/Feb/16 ] |
|
Has anyone discussed the potential implementation of this yet? I have a need for this operator and a few ideas about how I would expect it to work. If there hasn't been a design discussion about it, I would like to start one here. |
| Comment by Charlie Swanson [ 04/Feb/16 ] |
|
I've restricted the scope of this ticket to be simply $percentile, as $top would be possible if we resolved |
| Comment by Christine S [ 30/Jun/15 ] |
|
One possible implementation approach for quantiles would be to use a TDigest, such as is documented and implemented in this simple library: https://github.com/tdunning/t-digest |
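To give a feel for why a digest-style summary is attractive here, the following is a minimal sketch of a bounded-size quantile summary. It is deliberately not the t-digest algorithm: it uses a simpler closest-pair centroid merge as a stand-in, so it lacks t-digest's tighter accuracy near the tails. The class name, the `max_centroids` parameter, and the method names are hypothetical.

```python
import bisect


class CentroidSketch:
    """Bounded-size quantile summary (simplified stand-in, NOT t-digest)."""

    def __init__(self, max_centroids=50):
        self.max_centroids = max_centroids
        self.centroids = []  # sorted list of [mean, weight]

    def add(self, value, weight=1):
        """Insert one value, then shrink back to the size bound if needed."""
        bisect.insort(self.centroids, [float(value), weight])
        if len(self.centroids) > self.max_centroids:
            self._merge_closest_pair()

    def _merge_closest_pair(self):
        # Find the adjacent pair of centroids with the smallest gap.
        i = min(
            range(len(self.centroids) - 1),
            key=lambda k: self.centroids[k + 1][0] - self.centroids[k][0],
        )
        (m1, w1), (m2, w2) = self.centroids[i], self.centroids[i + 1]
        w = w1 + w2
        # The weighted mean lies between m1 and m2, so order is preserved.
        self.centroids[i:i + 2] = [[(m1 * w1 + m2 * w2) / w, w]]

    def quantile(self, q):
        """Approximate the q-quantile (0 <= q <= 1) from the centroids."""
        if not self.centroids:
            raise ValueError("empty sketch")
        if not 0 <= q <= 1:
            raise ValueError("q must be in [0, 1]")
        total = sum(w for _, w in self.centroids)
        target = q * total
        seen = 0
        for mean, weight in self.centroids:
            seen += weight
            if seen >= target:
                return mean
        return self.centroids[-1][0]


# Usage: memory stays at max_centroids entries no matter how many values arrive.
sketch = CentroidSketch(max_centroids=50)
for i in range(1000):
    sketch.add((i * 37) % 1000 + 1)  # the values 1..1000 in a scrambled order
approx_median = sketch.quantile(0.5)
```

The point of the design is that state is capped at `max_centroids` entries, which is what would let a percentile accumulator consume documents one at a time; the answer is an approximation, and how close it gets depends on the data and on the merge rule.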
| Comment by Charlie Swanson [ 20/May/15 ] |
|
Let's keep this ticket focused on $top and $percentile. Other additions can be addressed separately. In order to calculate $top or $percentile, we would need to hold on to all documents in the pipeline. This does not fit into the current streaming architecture. |
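The buffering problem described above can be sketched as follows. This is an illustrative accumulator shape only, assuming a nearest-rank definition of the percentile; the class and method names are hypothetical and do not reflect the server's internal accumulator interface.

```python
import math


class ExactPercentileAccumulator:
    """Exact nearest-rank percentile: state grows with every input value."""

    def __init__(self, p):
        if not 0 < p <= 100:
            raise ValueError("p must be in (0, 100]")
        self.p = p
        self.values = []  # every value in the group must be retained

    def add(self, value):
        self.values.append(value)

    def finalize(self):
        if not self.values:
            return None
        ordered = sorted(self.values)  # needs the whole group at once
        rank = math.ceil(self.p / 100 * len(ordered))  # 1-based nearest rank
        return ordered[rank - 1]


# Usage: unlike a running sum, nothing can be discarded before finalize().
acc = ExactPercentileAccumulator(50)
for v in [50, 15, 40, 20, 35]:
    acc.add(v)
median = acc.finalize()
```

An accumulator such as a sum or a maximum keeps a single value of state per group, whereas this one keeps the entire group, which is the mismatch with a streaming pipeline.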
| Comment by Stuart Radnidge [ 13/Apr/13 ] |
|
$pow and $log would be really useful to have; trying to use combinations of $multiply and $divide to do the same gets ugly fast!