Improve absl::raw_hash_set::find() performance on v5 toolchain

    • Type: Improvement
    • Resolution: Cannot Reproduce
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Query Execution
    • Query Execution
    • Programmability 2025-05-26
    • 200

      After switching to toolchain v5 as the default in 341832a, we observed a few performance regressions, including a 10% regression in ElemMatchLargeMixedInAndOrWithDuplicates, which caused BF-37502.

      The benchmark in question executes a simple find query with nested $elemMatch, $or, and $in predicates. Both the $or and $in predicates have many arguments. The benchmark targets the SBE engine (not yet enabled by default), which uses absl::raw_hash_set to implement the $in predicate.

      Our initial investigation identified a >9% regression in absl::raw_hash_set::find(). It would be good to recover this performance on toolchain v5; however, because absl is a third-party library, we should consider the best way to do so.
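
      For anyone trying to reproduce the find() cost in isolation, a minimal micro-benchmark sketch follows. It is not the attached hash_set_bm.cpp, and every helper name in it (make_keys, count_hits, ns_per_find) is hypothetical. It is templated on the set type and uses std::unordered_set as a stand-in so it builds without Abseil; the assumption is that an absl::flat_hash_set<std::int64_t> can be passed in the same way when Abseil is available.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical helper: builds n pseudo-random keys from a fixed seed so that
// runs on different toolchains probe exactly the same data.
std::vector<std::int64_t> make_keys(std::size_t n, std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::vector<std::int64_t> keys(n);
    for (auto& k : keys) {
        k = static_cast<std::int64_t>(rng() >> 1);  // keep values non-negative
    }
    return keys;
}

// Counts how many probes are present in the set. Returning the count gives
// the lookups an observable result, so the compiler cannot drop them.
template <typename Set>
std::size_t count_hits(const Set& set, const std::vector<std::int64_t>& probes) {
    std::size_t hits = 0;
    for (auto p : probes) {
        if (set.find(p) != set.end()) {
            ++hits;
        }
    }
    return hits;
}

// Runs `rounds` passes over the probes and returns the average wall-clock
// nanoseconds per find() call. The total hit count is written to hits_out.
template <typename Set>
double ns_per_find(const Set& set,
                   const std::vector<std::int64_t>& probes,
                   int rounds,
                   std::size_t& hits_out) {
    const auto start = std::chrono::steady_clock::now();
    std::size_t hits = 0;
    for (int r = 0; r < rounds; ++r) {
        hits += count_hits(set, probes);
    }
    const auto stop = std::chrono::steady_clock::now();
    hits_out = hits;
    const double ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return ns / (static_cast<double>(rounds) * static_cast<double>(probes.size()));
}
```

      To use it, build a set from make_keys(n, seed), probe it with a mix of present and absent keys, and compare ns_per_find between the old and new toolchains with identical flags. A wall-clock loop like this is noisy, so treat the numbers as indicative only.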

        1. bf-37502.js
          2 kB
          Romans Kasperovics
        2. diff_flamegraph2.svg
          106 kB
          Romans Kasperovics
        3. hash_set_bm.cpp
          10 kB
          Alex Li
        4. raw_hash_set_find_diff.svg
          488 kB
          Romans Kasperovics

            Assignee:
            Unassigned
            Reporter:
            Romans Kasperovics
            Votes:
            0
            Watchers:
            13

              Created:
              Updated:
              Resolved: