[SERVER-55057] Improvements to ComparisonMatchExpressions Created: 09/Mar/21  Updated: 29/Oct/23  Resolved: 23/Aug/21

Status: Closed
Project: Core Server
Component/s: Performance, Querying
Affects Version/s: None
Fix Version/s: 5.1.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Mathias Stearn Assignee: Bikash Chandra (Inactive)
Resolution: Fixed Votes: 0
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File matcher.patch    
Backwards Compatibility: Fully Compatible
Sprint: QE 2021-08-23, QE 2021-09-06
Participants:

 Description   

There are some low-hanging-fruit optimizations available to ComparisonMatchExpression:

  • Call canonicalType() less
    • SERVER-55043 will help a lot, but testing shows it is still faster to avoid canonicalizing where possible.
    • You only need to enter the first block if the type() doesn't match since same type implies same canonical type. This optimization is already done in woCompare, so this is about doing the same thing in CME.
    • Inside the if block it immediately canonicalizes again.
  • Optimize NaN checks
    • The special NaN logic isn't needed for EQ because it is already done correctly and efficiently in that case by BSONElement::compareElements.
    • It will do the NaN checks even for non-numeric types, such as strings. The logic ends up being correct, but it is wasteful.
    • It recomputes whether _rhs is NaN every time even though _rhs doesn't change.
    • It checks whether Int and Long are NaN, even though they don't have a NaN representation
    • It does an expensive Decimal -> double conversion just to check for nan, rather than checking for NaN while still a Decimal.
    • It then repeats the NaN check for both _rhs and e if either is NaN to check if both are NaN.
  • Do an early-out check when doing EQ comparisons with String to see if the lengths don't match before comparing contents.
    • woCompare/compareElements can't do that because it doesn't know if the caller is interested in just equality or relative order, but we know that here.
    • We can only do that with the simple collator, but that seems like a case worth optimizing for.

I'm attaching a patch which addresses all of these. I saw a huge improvement in perf with this patch vs without.

I also tried templating ComparisonMatchExpression::matchesSingleElement as matchesSingleElementImpl<MatchType> and calling it from each of the subclasses' matchesSingleElement() so that all of the branching on matchType() would compile away, but I didn't see any improvement in my workload, so I didn't include it in this patch. It is possible that workloads involving more than one MatchType would see some benefit, so it may be worth exploring in the future.



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 23/Aug/21 ]

Author:

{'name': 'Bikash Chandra', 'email': 'bikash.chandra@Bikashs-MacBook-Pro.local'}

Message: SERVER-55057 Improvements to ComparisonMatchExpressions
Branch: master
https://github.com/mongodb/mongo/commit/1c458cebc23d20bf259d3774498b5080b7565ca1

Comment by Rushan Chen [ 22/Jul/21 ]

For Bikash.

Generated at Thu Feb 08 05:35:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.