  1. Core Server
  2. SERVER-77942

Performance regressions in $graphLookup due to makePipeline

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Execution
    • ALL

      Mongo-perf shows a 15% regression in the LookupViaGraphLookup test between 6.0 and 7.0, which I believe is caused by several slowdowns under mongo::pipeline::makePipeline. Notably, 7.0 spends more time under mongo::getExecutor and in handling AutoGetCollectionForReadCommandMaybeLockFree, but unfortunately there is no single point of regression.

      https://jira.mongodb.org/browse/BF-28421 concerns some other Lookup-related benchmarks but claims that they are not representative of real use cases. I don't think the same reasoning applies to LookupViaGraphLookup: I can reproduce the regression on a collection that represents a binary tree via a parent link, with a query that outputs the direct children of each node, which is a typical scenario in hierarchical datasets.
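
The repro scenario described above could be set up roughly as follows. This is a minimal sketch rather than the actual benchmark code: the collection name "tree", the field name "parent", the output field "children", and the helper names are assumptions chosen for illustration. It only builds the documents and the aggregation pipeline; running it against a server would need a driver such as pymongo, which is not imported here.

```python
def make_binary_tree(n):
    """Return n documents forming a complete binary tree.

    Each node points at its parent through the (assumed) "parent" field;
    the root has parent None.
    """
    docs = []
    for i in range(n):
        parent = None if i == 0 else (i - 1) // 2
        docs.append({"_id": i, "parent": parent})
    return docs


def direct_children_pipeline(coll_name):
    """Build a $graphLookup stage that returns the direct children of each node.

    maxDepth 0 stops after the first hop, so only documents whose "parent"
    equals the current node's _id are returned.
    """
    return [
        {
            "$graphLookup": {
                "from": coll_name,
                "startWith": "$_id",
                "connectFromField": "_id",
                "connectToField": "parent",
                "maxDepth": 0,
                "as": "children",
            }
        }
    ]


docs = make_binary_tree(7)
pipeline = direct_children_pipeline("tree")  # "tree" is a hypothetical name

# With a driver (assumption, not executed here) this would be roughly:
#   db.tree.insert_many(docs)
#   results = list(db.tree.aggregate(pipeline))
```

Comparing the wall-clock time of the same aggregation on 6.0 and 7.0 builds, with a larger n, is one way to check for the regression outside of mongo-perf.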

      https://jira.mongodb.org/browse/BF-28050 is another ticket related to regressions in Lookup. It's not marked as a 7.0 blocker (perhaps because the regression is no longer severe after SERVER-75853?) and doesn't appear to have been investigated beyond SERVER-75853.

      GraphLookup is sensitive to slowdowns under makePipeline() because it may call the method many times. Based on my observations (I haven't spent much time reading the implementation), it is called once for each unique matched value of "connectFromField" in the local collection, plus once for each document in the local collection that doesn't match anything.
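
To get a feel for how quickly the call count grows, the observation above can be written down as a small estimator. This is a literal transcription of that observation, not something derived from the server implementation, so it may not match the real count. The function name and parameters are hypothetical, and it assumes the start value of each local document is its "connectFromField" value, as in a single-hop self-join.

```python
def estimate_make_pipeline_calls(local_docs, foreign_docs,
                                 connect_from_field, connect_to_field):
    """Estimate makePipeline() calls per the observation in this ticket.

    Counts one call per unique matched connect_from_field value in the
    local collection, plus one call per local document that matches nothing.
    """
    # Values that at least one foreign document can be reached through.
    targets = {d.get(connect_to_field) for d in foreign_docs}

    matched_values = set()
    unmatched_docs = 0
    for doc in local_docs:
        value = doc.get(connect_from_field)
        if value in targets:
            matched_values.add(value)
        else:
            unmatched_docs += 1
    return len(matched_values) + unmatched_docs


# 7-node binary tree linked via "parent" (assumed field name), joined to itself.
tree = [{"_id": i, "parent": None if i == 0 else (i - 1) // 2} for i in range(7)]
estimate = estimate_make_pipeline_calls(tree, tree, "_id", "parent")
```

For this 7-node tree the estimate is 7: three internal nodes each contribute a unique matched value, and four leaves match nothing. Under this model the number of calls grows linearly with the collection size, which is why a small per-call slowdown can add up.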

      Relevant tests in mongo-perf:

      Aggregation.LookupViaGraphLookup
      Aggregation.GraphLookupNeighbors
      Aggregation.IdentityView.LookupViaGraphLookup
      Aggregation.IdentityView.GraphLookupNeighbors
      Aggregation.GraphLookupSocialite
      Aggregation.IdentityView.GraphLookupSocialite

      https://evergreen.mongodb.com/task/sys_perf_6.0_linux_microbenchmarks_standalone_intel.2023_01_aggregation_read_commands_3b9ef88e5a613a7f5c62fb2d0c847482c6a5d85a_23_06_02_14_06_07

      https://evergreen.mongodb.com/task/sys_perf_7.0_linux_microbenchmarks_standalone_intel.2023_01_aggregation_read_commands_e54a7ebb88c7c922e9f817e7c23a4b16fcf288af_23_06_02_20_31_12

       

            Assignee:
            backlog-query-execution [DO NOT USE] Backlog - Query Execution
            Reporter:
            irina.yatsenko@mongodb.com Irina Yatsenko (Inactive)
            Votes:
            0
            Watchers:
            4
