Improve absl::raw_hash_set::find() performance on v5 toolchain


    • Type: Improvement
    • Resolution: Cannot Reproduce
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Query Execution
    • Query Execution
    • Programmability 2025-05-26
    • 200
    • 3
    • TBD

      After switching to toolchain v5 as the default in 341832a, we observed a few performance regressions, including a 10% regression in ElemMatchLargeMixedInAndOrWithDuplicates, which caused BF-37502.

      The benchmark in question executes a simple find query with nested $elemMatch, $or, and $in predicates. Both the $or and $in predicates have many arguments. The benchmark targets the SBE engine (not yet enabled by default), which uses absl::raw_hash_set to implement the $in predicate.

      Our initial investigation identified a >9% regression in absl::raw_hash_set::find(). It would be good to recover this performance on toolchain v5; however, because absl is a third-party library, we should consider the best way to do so.
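
      To separate the cost of find() from the rest of the query, a small lookup microbenchmark can be built once per toolchain and the per-lookup times compared. The sketch below is only an illustration and is not the attached hash_set_bm.cpp. The names InSet, make_in_set, count_hits, and ns_per_find are hypothetical, and std::unordered_set is used as a stand-in so the code builds without Abseil. To measure the container discussed here, the alias would need to point at absl::flat_hash_set<std::int64_t>, which is built on absl::raw_hash_set.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Stand-in for the set of $in arguments. ASSUMPTION: std::unordered_set is used
// only to keep the sketch dependency-free; replace it with
// absl::flat_hash_set<std::int64_t> to exercise absl::raw_hash_set::find().
using InSet = std::unordered_set<std::int64_t>;

// Builds a set holding the even integers 0, 2, ..., 2*(n-1),
// so every odd probe is a guaranteed miss.
InSet make_in_set(std::size_t n) {
    InSet s;
    s.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        s.insert(static_cast<std::int64_t>(2 * i));
    }
    return s;
}

// Counts how many probe values are present in the set. Returning the count
// gives the compiler an observable result, so the lookups are not optimized away.
std::size_t count_hits(const InSet& s, const std::vector<std::int64_t>& probes) {
    std::size_t hits = 0;
    for (std::int64_t p : probes) {
        if (s.find(p) != s.end()) {
            ++hits;
        }
    }
    return hits;
}

// Runs `rounds` passes over `probes` and returns the average wall-clock
// nanoseconds per find() call. The total hit count is written to `hits_out`.
double ns_per_find(const InSet& s, const std::vector<std::int64_t>& probes,
                   int rounds, std::size_t& hits_out) {
    assert(rounds > 0 && !probes.empty());
    const auto start = std::chrono::steady_clock::now();
    std::size_t hits = 0;
    for (int r = 0; r < rounds; ++r) {
        hits += count_hits(s, probes);
    }
    const auto stop = std::chrono::steady_clock::now();
    hits_out = hits;
    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return total_ns / (static_cast<double>(rounds) * static_cast<double>(probes.size()));
}
```

      Calling ns_per_find from a small main() with a large set and a mix of hitting and missing probes, then building the same file with each toolchain, gives a rough per-lookup comparison. Results from a loop like this are noisy, so a dedicated benchmark harness and several repetitions would be needed before drawing conclusions.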

        1. bf-37502.js
          2 kB
        2. diff_flamegraph2.svg
          106 kB
        3. hash_set_bm.cpp
          10 kB
        4. raw_hash_set_find_diff.svg
          488 kB

              Assignee:
              Unassigned
              Reporter:
              Romans Kasperovics
              Votes:
              0
              Watchers:
              13

                Created:
                Updated:
                Resolved: