[QUERY_LIMITS] Slow estimation of $all with many arguments

XMLWordPrintableJSON

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Critical - P2
    • 9.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization
    • Fully Compatible
    • 200
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      A query that has $all: [...] with  many arguments takes long time to estimate, enough to trigger a slow query log event.

      Perf reports the following:

      +   99.49%     0.00%  conn9            mongod_with_debug  [.] mongo::(anonymous namespace)::PrepareExecutionHelper<mongo::PlanCacheKey, mongo::(anonymous namespace)::ClassicRuntimePlannerResult>::prepare()
      +   99.49%     0.00%  conn9            mongod_with_debug  [.] mongo::QueryPlanner::planWithCostBasedRanking(mongo::CanonicalQuery const&, mongo::QueryPlannerParams const&, mongo::ce::SamplingEstimator*, mong
      +   99.46%     0.00%  conn9            mongod_with_debug  [.] mongo::cost_based_ranker::CardinalityEstimator::estimate(mongo::QuerySolutionNode const*)
      +   99.46%     0.00%  conn9            mongod_with_debug  [.] mongo::cost_based_ranker::CardinalityEstimator::estimate(mongo::FetchNode const*)
      +   99.20%     0.00%  conn9            mongod_with_debug  [.] mongo::cost_based_ranker::CardinalityEstimator::estimate(mongo::FetchNode const*)::$_1::operator()() const
      +   98.46%     0.00%  conn9            mongod_with_debug  [.] mongo::ce::SamplingEstimatorImpl::estimateCardinality(mongo::MatchExpression const*) const
      +   98.41%     0.01%  conn9            mongod_with_debug  [.] mongo::exec::matcher::MatchExpressionEvaluator::visit(mongo::AndMatchExpression const*)
      +   98.39%    15.53%  conn9            mongod_with_debug  [.] mongo::exec::matcher::MatchExpressionEvaluator::visitPathExpression(mongo::PathMatchExpression const*)
      +   39.22%    29.10%  conn9            mongod_with_debug  [.] mongo::BSONElementIterator::more()
      +   38.04%    15.84%  conn9            mongod_with_debug  [.] mongo::exec::matcher::MatchesSingleElementEvaluator::visit(mongo::EqualityMatchExpression const*)
      +   22.21%    22.21%  conn9            mongod_with_debug  [.] mongo::BSONElement::compareElements(mongo::BSONElement const&, mongo::BSONElement const&, unsigned int, mongo::StringDataComparator const*)
      +   10.11%    10.11%  conn9            mongod_with_debug  [.] mongo::BSONElementIterator::subCursorHasMore()
      +    3.93%     3.93%  conn9            mongod_with_debug  [.] mongo::BSONElementIterator::next()
      +    1.64%     1.64%  conn9            mongod_with_debug  [.] mongo::EqualityMatchExpression::acceptVisitor(mongo::MatchExpressionVisitor<true>*) const
       

      The scenario is DatasetWideArrayIndex.WorkloadAll in jstests/product_limits/query_limits_test.js

       

      Problem

      $all: [a, b, c …] is syntactic sugar for $and with a, b , c as predicates → becomes an AndMatchExpression with each member of the list as an EqualityMatchExpression

      From docs page:

      { tags: \{ $all: [ "ssl" , "security" ] }

      } is equivalent to

      { $and: [ \{ tags: "ssl" }

      , { tags: "security" } ] }

       

      So for each of n EqualityMatchExpression we traverse the array (length m) at that field in the document

      Time complexity is O(n * m) per document in the sample

       

      Proposed solution

      Instead, detect this case and optimize by doing only 1 pass over the field’s array

      Sort the list of predicate values → O(nlogn)

      Sort the array in the document → O(mlogm)

      Iterate once over the array, checking predicates as we go → O(n + m)

       

      Time complexity of this approach is O(nlogn + mlogm + n + m) = O(nlogn + mlogm)

      Much better than n * m

            Assignee:
            Natalie Hill
            Reporter:
            Philip Stoev
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: