[SERVER-75144] Investigate performance of $percentile expression Created: 22/Mar/23  Updated: 29/Oct/23  Resolved: 11/Apr/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.0.0-rc0

Type: Task Priority: Major - P3
Reporter: Irina Yatsenko (Inactive) Assignee: Irina Yatsenko (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Integration
Backwards Compatibility: Fully Compatible
Sprint: QI 2023-04-17
Participants:

 Description   

A naïve implementation of the expression that feeds one value at a time into an accumulator works but likely can be improved perf-wise. We should consider:

  1. sending the whole input at once rather than each value one by one
  2. not using t-digest for small inputs


 Comments   
Comment by Irina Yatsenko (Inactive) [ 11/Apr/23 ]

Switched $percentile expression to use DiscretePercentile with std::nth_element for computing a small set of percentiles or std::sort when >20 percentiles are requested. For a single percentile, we are now ~35% slower than $min (measured on array fields of length 100, 1000 and 10'000) 

For computing a single percentile over a field with a 1000-element array have:

    93.61%     6.54%  mongod              [.] mongo::ExpressionFromAccumulatorQuantile<mongo::AccumulatorPercentile>::evaluate
            |          
            |--87.08%--mongo::ExpressionFromAccumulatorQuantile<mongo::AccumulatorPercentile>::evaluate
            |          |          
            |          |--50.97%--mongo::ExpressionFieldPath::evaluate
            |          |          
            |          |--26.55%--mongo::DiscretePercentile::computePercentiles
            |          |          |          
            |          |           --26.50%--mongo::DiscretePercentile::computePercentile
            |          |                     |          
            |          |                      --26.43%--std::__introselect<...>
            |          |          
            |          |--4.74%--mongo::Value::coerceToDouble
            |          |          
            |           --4.71%--mongo::DiscretePercentile::incorporate
            |          
             --6.54%--start_thread
                       ...
                       mongo::runAggregate
                       mongo::PlanExecutorImpl::getNext
                       mongo::PlanExecutorImpl::_getNextImpl
                       mongo::PlanStage::work
                       mongo::ProjectionStage::doWork
                       mongo::ProjectionStageDefault::transform
                       mongo::projection_executor::ProjectionExecutor::applyTransformation
                       mongo::projection_executor::InclusionProjectionExecutor::applyProjection
                       mongo::projection_executor::ProjectionNode::applyToDocument
                       |          
                        --6.50%--mongo::projection_executor::ProjectionNode::applyExpressions
                                  mongo::ExpressionFromAccumulatorQuantile<mongo::AccumulatorPercentile>::evaluate

Comment by Githook User [ 11/Apr/23 ]

Author:

{'name': 'Irina Yatsenko', 'email': 'irina.yatsenko@mongodb.com', 'username': 'IrinaYatsenko'}

Message: SERVER-75144 Use discrete algorithm in $percentile expression on small input
Branch: master
https://github.com/mongodb/mongo/commit/37af16a5d6253e18474a4e6e50cb62b27dd99b85

Generated at Thu Feb 08 06:29:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.