Core Server / SERVER-81868

SBE $group implementation still scales poorly with number of accumulators

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Execution

      If we run a simple test using a $group query with many accumulators, SBE performs worse than the classic engine, and the gap appears to grow with the number of accumulators.
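
      A minimal sketch of how such a test query could be generated, for anyone trying to reproduce this. The helper name `build_group_pipeline`, the group key `k`, and the input fields `f0`, `f1`, ... are hypothetical and not taken from the actual test; only the shape of the $group stage matters.

```python
def build_group_pipeline(num_accumulators, group_key="$k", op="$avg"):
    """Build a one-stage pipeline whose $group has `num_accumulators` accumulators.

    The field names (k, f0, f1, ...) are placeholders for illustration only.
    """
    group_spec = {"_id": group_key}
    for i in range(num_accumulators):
        # Each output field applies the same accumulator to a distinct input field.
        group_spec[f"acc{i}"] = {op: f"$f{i}"}
    return [{"$group": group_spec}]


# 20 accumulators, roughly the point at which the report says the gap becomes visible.
pipeline = build_group_pipeline(20)
```

      The resulting list can be passed to an aggregate call (for example PyMongo's `collection.aggregate(pipeline)`) and timed under each engine while varying `num_accumulators`.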

      For many queries the runtime is dominated by other work besides the accumulators (reading data, evaluating other expressions, etc). In these cases, the "regression" in time spent accumulating may not be visible at all. On the other hand, when running the accumulators is a large fraction of the query runtime, there is a clear difference.

      Currently the only way to see this issue is through queries with a large number (20+) of accumulators. However, when running a time series $group query in SBE, we see similar behavior. This is because with time series, the amount of work done to read each document is relatively small, so the $group processing represents a greater fraction of the runtime.

      The issue appears to be most severe with $avg, presumably because the SBE implementation decomposes this into two separate accumulators (sum and count).
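
      To illustrate the decomposition mentioned above: an average can be maintained as two pieces of per-group state, a running sum and a running count, combined only at the end. The sketch below is plain Python written for illustration, not SBE code; the names `AvgState` and `group_avg` are made up. It shows why each $avg costs two state updates per input document instead of one.

```python
class AvgState:
    """Per-group state for an average, kept as a separate sum and count."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, value):
        # Two updates per input value: one for the sum, one for the count.
        self.total += value
        self.count += 1

    def finalize(self):
        # The average is only computed once, when the group is complete.
        return self.total / self.count if self.count else None


def group_avg(rows, key, field):
    """Group `rows` (a list of dicts) by `key` and average `field` per group."""
    states = {}
    for row in rows:
        states.setdefault(row[key], AvgState()).add(row[field])
    return {k: s.finalize() for k, s in states.items()}
```

      With N $avg accumulators, a scheme like this performs 2N state updates per document, which would be consistent with $avg showing the largest gap.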

            Assignee:
            backlog-query-execution [DO NOT USE] Backlog - Query Execution
            Reporter:
            ian.boros@mongodb.com Ian Boros
            Votes:
            0
            Watchers:
            6

              Created:
              Updated: