  Core Server / SERVER-60893

Deduplicate field lookups for the same field path when $group is pushed down to SBE

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 5.2.0
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Sprint: QE 2021-11-01

      Currently, the $group SBE stage builder generates a separate field lookup expression and/or stage for every reference to a field path, even when several references name the same path. This can hurt performance when multiple accumulators access the same field, and the degradation is much worse for non-top-level field paths than for top-level ones.
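      One way to picture the deduplication is a small memo table keyed by field path, so that each distinct path is resolved once and every accumulator reuses the resulting slot. The sketch below is illustrative only: it is plain JavaScript rather than the server's stage builder code, and `makeLookupPlanner`, `slotFor`, and the slot numbering are hypothetical names chosen for this example.

```javascript
// Hypothetical sketch: assign one slot per distinct field path.
// This is not the actual SBE stage builder; all names are illustrative.
function makeLookupPlanner() {
  const slotByPath = new Map(); // field path -> slot name
  const lookups = [];           // lookups that would actually be emitted
  let nextSlot = 1;

  return {
    // Return the slot holding `path`, recording a lookup only on first use.
    slotFor(path) {
      if (!slotByPath.has(path)) {
        const slot = `s${nextSlot++}`;
        slotByPath.set(path, slot);
        lookups.push({ slot, path });
      }
      return slotByPath.get(path);
    },
    lookups,
  };
}

// Accumulators from a $group spec such as
// {min: {$min: "$price.a"}, max: {$max: "$price.a"}, f: {$first: "$price.a"}}
const planner = makeLookupPlanner();
const accumulators = [
  { name: "min", op: "$min", path: "price.a" },
  { name: "max", op: "$max", path: "price.a" },
  { name: "f", op: "$first", path: "price.a" },
];
const bound = accumulators.map((acc) => ({
  ...acc,
  slot: planner.slotFor(acc.path),
}));

console.log(planner.lookups.length);   // 1
console.log(bound.map((b) => b.slot)); // [ 's1', 's1', 's1' ]
```

      With a table like this, three accumulators over "$price.a" share a single lookup instead of producing three.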

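      This is an example of such a case. All three accumulators read "$price.a", yet the plan contains three separate getField (s10, "price") projections (s13, s17 and s21), each with its own traverse stage: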

      MongoDB Enterprise > db.t.explain().aggregate([{$group: {_id: "$item", min: {$min: "$price.a"}, max: {$max: "$price.a"}, f: {$first: "$price.a"}}}]).queryPlanner.winningPlan.slotBasedPlan.stages
      [3] mkbson s27 [_id = s12, min = s25, max = s26, f = s24] true false
      [3] project [s25 = fillEmpty (s16, null), s26 = fillEmpty (s20, null)]
      [3] group [s12] [s16 = min (let [l1.0 = s15] if (! exists (l1.0) || typeMatch (l1.0, 0x00000440), Nothing, l1.0)), s20 = max (let [l2.0 = s19] if (! exists (l2.0) || typeMatch (l2.0, 0x00000440), Nothing, l2.0)), s24 = first (fillEmpty (s23, null))]
      [3] traverse s23 s22 s21 [s10, s11, s12, s13, s15, s17, s19] {} {}
      from
          [3] project [s21 = getField (s10, "price")]
          [3] traverse s19 s18 s17 [s10, s11, s12, s13, s15] {} {}
          from
              [3] project [s17 = getField (s10, "price")]
              [3] traverse s15 s14 s13 [s10, s11, s12] {} {}
              from
                  [3] project [s13 = getField (s10, "price")]
                  [3] project [s12 = fillEmpty (s11, null)]
                  [3] project [s11 = getField (s10, "item")]
                  [2] traverse s10 s9 s4 [s5] {} {}
                  from
                      [1] scan s4 s5 none none none none [] @"7fb2bafd-3565-4723-8153-e775e578c8c9" true false
                  in
                      [2] cfilter {isObject (s4)}
                      [2] mkbson s9 s4 [item] keep [price = s8] true false
                      [2] traverse s8 s7 s6 {} {}
                      from
                          [2] project [s6 = getField (s4, "price")]
                          [2] limit 1
                          [2] coscan
                      in
                          [2] cfilter {isObject (s6)}
                          [2] mkbson s7 s6 [a] keep [] true false
                          [2] limit 1
                          [2] coscan
              in
                  [3] project [s14 = getField (s13, "a")]
                  [3] limit 1
                  [3] coscan
          in
              [3] project [s18 = getField (s17, "a")]
              [3] limit 1
              [3] coscan
      in
          [3] project [s22 = getField (s21, "a")]
          [3] limit 1
          [3] coscan
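      To check a plan for this kind of duplication, one option is to count how often each `getField` call appears in the printed stages. The helper below is a rough sketch in plain JavaScript: `countGetFieldCalls` is a hypothetical name, and the regular expression assumes the `getField (slot, "name")` text format seen in the output above, which may differ between server versions.

```javascript
// Hypothetical helper: tally identical getField calls in SBE plan text.
function countGetFieldCalls(stagesText) {
  const counts = new Map(); // "slot.field" -> number of occurrences
  // Matches text such as: getField (s10, "price")
  const re = /getField \((s\d+), "([^"]+)"\)/g;
  for (const m of stagesText.matchAll(re)) {
    const key = `${m[1]}.${m[2]}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Four lines excerpted from the plan above.
const excerpt = [
  '[3] project [s21 = getField (s10, "price")]',
  '[3] project [s17 = getField (s10, "price")]',
  '[3] project [s13 = getField (s10, "price")]',
  '[3] project [s11 = getField (s10, "item")]',
].join("\n");

const counts = countGetFieldCalls(excerpt);
console.log(counts.get("s10.price")); // 3
console.log(counts.get("s10.item"));  // 1
```

      Any entry with a count above one marks a lookup that is repeated against the same input slot.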
      

            Assignee: Yoon Soo Kim (yoonsoo.kim@mongodb.com)
            Reporter: Yoon Soo Kim (yoonsoo.kim@mongodb.com)
            Votes: 0
            Watchers: 4
