Core Server / SERVER-81868

SBE $group implementation still scales poorly with number of accumulators

Details

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Query Execution

    Description

      If we run a simple test with a $group query that has many accumulators, SBE performs worse than the classic engine, and the gap appears to widen as the number of accumulators grows.
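
      A minimal sketch of the kind of test query described above, assuming documents with a grouping field `k` and numeric fields `f0`, `f1`, and so on. The field names and the helper `build_group_pipeline` are hypothetical and not taken from the original test. The helper only builds the pipeline; the result could then be passed to a driver call such as PyMongo's `collection.aggregate(pipeline)`.

```python
def build_group_pipeline(num_accumulators, op="$avg", key_field="k"):
    """Build a one-stage pipeline whose $group has `num_accumulators` accumulators.

    Field names (k, f0, f1, ...) are hypothetical placeholders.
    """
    if num_accumulators < 1:
        raise ValueError("need at least one accumulator")
    group_spec = {"_id": f"${key_field}"}
    for i in range(num_accumulators):
        # Each output field applies the same accumulator to a different input field.
        group_spec[f"acc{i}"] = {op: f"$f{i}"}
    return [{"$group": group_spec}]


# A 20-accumulator pipeline, the size at which the ticket says the gap becomes visible.
pipeline = build_group_pipeline(20)
```

      Running the same pipeline under the classic engine and under SBE while increasing `num_accumulators` is one way to observe how the gap changes.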

      For many queries the runtime is dominated by work other than the accumulators (reading data, evaluating other expressions, etc.). In these cases, the "regression" in time spent accumulating may not be visible at all. On the other hand, when running the accumulators is a large fraction of the query runtime, there is a clear difference.

      Currently the only way to see this issue is through queries with a large number (20+) of accumulators. However, when running a time series $group query in SBE, we see similar behavior. This is because with time series, the amount of work done to read each document is relatively small, so the $group processing represents a greater fraction of the runtime.

      The issue appears to be most severe with $avg, presumably because the SBE implementation decomposes this into two separate accumulators (sum and count).
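
      To illustrate why a decomposed $avg could cost more per input row, here is a toy model in which an average is kept as two independent partial accumulators, a sum and a count, each updated for every row and combined only at the end. This is not SBE code, and all class names are hypothetical.

```python
class SumAcc:
    """Partial accumulator that keeps a running total."""

    def __init__(self):
        self.total = 0.0

    def add(self, value):
        self.total += value


class CountAcc:
    """Partial accumulator that counts the values it has seen."""

    def __init__(self):
        self.n = 0

    def add(self, value):
        self.n += 1


class DecomposedAvg:
    """Average kept as two separate accumulators (toy model, not SBE code)."""

    def __init__(self):
        self.parts = [SumAcc(), CountAcc()]

    def add(self, value):
        # Two accumulator updates per input value instead of one.
        for part in self.parts:
            part.add(value)

    def finalize(self):
        total, count = self.parts[0].total, self.parts[1].n
        return total / count if count else None


avg = DecomposedAvg()
for v in [1, 2, 3, 4]:
    avg.add(v)
result = avg.finalize()
```

      In this model, N $avg accumulators perform 2N updates per row, which would make any per-accumulator overhead count twice. The ticket only presumes this is the cause.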

          People

            backlog-query-execution Backlog - Query Execution
            ian.boros@mongodb.com Ian Boros