Details

      Description

      A $percentile operator would enable computing values such as the median or the 99th percentile.

      Original Description

      Now that the nuts and bolts of aggregation have been taken care of by the new aggregation framework, higher-level functions like percentile and top, which we use extensively to report on metrics, are highly desirable.
      Splunk lets me write custom operators and hook them into the query pipeline. Having something similar would be a great enhancement to the aggregation framework.

          Activity

          sturadnidge Stuart Radnidge added a comment -

          $pow and $log would be really useful to have; trying to use combinations of $multiply and $divide to do the same gets ugly fast!

          charlie.swanson Charlie Swanson added a comment -

          Let's keep this ticket focused on $top and $percentile. Other additions can be addressed separately.

          In order to calculate $top or $percentile, we would need to hold on to all documents in the pipeline. This does not fit into the current streaming architecture.
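To illustrate the constraint described above, here is a minimal Python sketch of an exact percentile using the nearest-rank definition. It is not MongoDB code; the function name `exact_percentile` and the choice of nearest-rank are assumptions made for illustration. The point is that the result cannot be produced until every input value has been consumed and sorted.

```python
import math


def exact_percentile(stream, p):
    """Exact nearest-rank percentile (hypothetical helper, not a MongoDB API).

    The whole stream has to be buffered and sorted before any answer
    is available, which is what conflicts with a streaming pipeline.
    """
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    buffered = sorted(stream)  # consumes and holds every value
    if not buffered:
        raise ValueError("no values")
    # Nearest-rank: the smallest value with at least p% of the data at or below it.
    rank = math.ceil(p * len(buffered) / 100)
    return buffered[rank - 1]
```

Memory use grows linearly with the number of documents in the group, which is why the approximate approaches discussed later in this thread are attractive.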

          cstepnitz Christine S added a comment -

          One possible implementation approach for quantiles would be to use a TDigest, as documented and implemented in this library: https://github.com/tdunning/t-digest
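As a rough illustration of the idea behind this suggestion, the following Python sketch keeps a small sorted list of weighted centroids instead of every value. It is a simplified, t-digest-style summary written for this note, not the API or the algorithm of the linked library. The class name `TinyDigest`, the parameters `delta` and `buffer_size`, and the weight cap of the form 4·N·q·(1−q)/delta are all assumptions chosen for the example; the cap keeps centroids small near the tails and lets them grow near the median.

```python
class TinyDigest:
    """Simplified t-digest-style summary (illustrative only)."""

    def __init__(self, delta=100, buffer_size=500):
        self.delta = delta              # larger delta keeps more, smaller centroids
        self.buffer_size = buffer_size  # raw values held before compressing
        self.centroids = []             # sorted list of [mean, weight]
        self.buffer = []
        self.total = 0                  # number of values seen so far

    def add(self, x):
        self.buffer.append(float(x))
        self.total += 1
        if len(self.buffer) >= self.buffer_size:
            self._compress()

    def _compress(self):
        # Treat buffered values as weight-1 centroids and merge neighbours greedily.
        points = sorted(self.centroids + [[x, 1] for x in self.buffer])
        self.buffer = []
        if not points:
            return
        merged = [points[0][:]]
        seen = 0  # total weight of centroids that precede merged[-1]
        for mean, weight in points[1:]:
            cur = merged[-1]
            combined = cur[1] + weight
            q = (seen + combined / 2) / self.total
            limit = 4 * self.total * q * (1 - q) / self.delta
            if combined <= limit:
                cur[0] = (cur[0] * cur[1] + mean * weight) / combined
                cur[1] = combined
            else:
                seen += cur[1]
                merged.append([mean, weight])
        self.centroids = merged

    def quantile(self, q):
        """Estimate the q-quantile (0 <= q <= 1) by interpolating centroid means."""
        self._compress()
        if not self.centroids:
            raise ValueError("empty digest")
        target = q * self.total
        cum = 0.0
        prev = None  # (rank midpoint, mean) of the previous centroid
        for mean, weight in self.centroids:
            mid = cum + weight / 2
            if target <= mid:
                if prev is None:
                    return mean
                prev_mid, prev_mean = prev
                frac = (target - prev_mid) / (mid - prev_mid)
                return prev_mean + frac * (mean - prev_mean)
            prev = (mid, mean)
            cum += weight
        return self.centroids[-1][0]
```

A caller would feed values one at a time with `add` and ask for `quantile(0.5)` or `quantile(0.99)` at the end. The answers are estimates, and this simplified version behaves best when values arrive in roughly random order.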

          charlie.swanson Charlie Swanson added a comment -

          I've restricted the scope of this ticket to be simply $percentile, as $top would be possible if we resolved SERVER-9377.

          real_ate Chris Manson added a comment -

          Has anyone discussed the potential implementation of this yet? I have a need for this operator and a few ideas about how I would expect it to work. If there hasn't been a design discussion about it, I would like to start it here.

          asya Asya Kamsky added a comment -

          Chris Manson I don't believe we have - feel free to add your ideas here.

          To make sure we are on the same page, it would help to include some specific examples.

          atg@webperf.io Anh-tuan Gai added a comment -

          +1 for a TDigest implementation to build a $percentile aggregation accumulator. Moreover, for incremental aggregation and merging of several TDigests (e.g. to compute the percentile over a long-running period by aggregating multiple TDigests for sub-periods), a TDigest serializer/deserializer would be great, so that a TDigest can be stored as Binary Data.
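The store-and-merge workflow described above could look roughly like the Python sketch below. It uses a simplified centroid summary rather than a real TDigest, and every name in it (`summarize`, `merge`, `serialize`, `deserialize`, `quantile`, the `delta` parameter, and the byte layout) is hypothetical. The byte layout is a 4-byte count followed by little-endian float64 mean and uint64 weight pairs; the resulting `bytes` value is the kind of payload that could be stored in a binary field.

```python
import struct


def merge(digests, delta=100):
    """Merge centroid lists [(mean, weight), ...] into one compressed list."""
    points = sorted(c for d in digests for c in d)
    total = sum(w for _, w in points)
    merged = []
    seen = 0  # total weight of centroids that precede merged[-1]
    for mean, weight in points:
        if merged:
            cur_mean, cur_weight = merged[-1]
            combined = cur_weight + weight
            q = (seen + combined / 2) / total
            # Assumed weight cap: small centroids at the tails, larger in the middle.
            if combined <= 4 * total * q * (1 - q) / delta:
                new_mean = (cur_mean * cur_weight + mean * weight) / combined
                merged[-1] = (new_mean, combined)
                continue
            seen += cur_weight
        merged.append((mean, weight))
    return merged


def summarize(values, delta=100):
    """Build a summary for one sub-period from raw values."""
    return merge([[(float(v), 1) for v in values]], delta)


def serialize(centroids):
    """Pack a summary into bytes: uint32 count, then (float64, uint64) pairs."""
    out = struct.pack("<I", len(centroids))
    for mean, weight in centroids:
        out += struct.pack("<dQ", mean, weight)
    return out


def deserialize(blob):
    """Inverse of serialize."""
    (n,) = struct.unpack_from("<I", blob, 0)
    return [struct.unpack_from("<dQ", blob, 4 + 16 * i) for i in range(n)]


def quantile(centroids, q):
    """Estimate the q-quantile (0 <= q <= 1) by interpolating centroid means."""
    if not centroids:
        raise ValueError("empty summary")
    target = q * sum(w for _, w in centroids)
    cum = 0.0
    prev = None  # (rank midpoint, mean) of the previous centroid
    for mean, weight in centroids:
        mid = cum + weight / 2
        if target <= mid:
            if prev is None:
                return mean
            prev_mid, prev_mean = prev
            frac = (target - prev_mid) / (mid - prev_mid)
            return prev_mean + frac * (mean - prev_mean)
        prev = (mid, mean)
        cum += weight
    return centroids[-1][0]
```

With this shape, each sub-period is summarized and serialized once. A percentile over a longer period is then estimated by deserializing the stored blobs, passing them to `merge`, and calling `quantile` on the result, without revisiting the raw values.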


            People

            • Votes: 24
            • Watchers: 23

              Dates

              • Days since reply: 16 weeks ago