[SERVER-60893] Deduplicate field lookups for a same field path when $group is pushed down to SBE Created: 21/Oct/21  Updated: 29/Oct/23  Resolved: 26/Oct/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.2.0

Type: Task Priority: Major - P3
Reporter: Yoon Soo Kim Assignee: Yoon Soo Kim
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
backports SERVER-60895 Multiple accumulators on the same fie... Closed
Backwards Compatibility: Fully Compatible
Sprint: QE 2021-11-01
Participants:

Description

Currently, the $group SBE stage builder generates a separate field lookup expression and/or stage for every reference to the same field path. This can hurt performance when multiple accumulators access the same field, and the degradation is much worse for nested (non-top-level) field paths than for top-level ones.

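The following is an example of such a case. The path "price.a" is looked up three times, once per accumulator, each time with its own traverse stage and getField projections: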

MongoDB Enterprise > db.t.explain().aggregate([{$group: {_id: "$item", min: {$min: "$price.a"}, max: {$max: "$price.a"}, f: {$first: "$price.a"}}}]).queryPlanner.winningPlan.slotBasedPlan.stages
[3] mkbson s27 [_id = s12, min = s25, max = s26, f = s24] true false
[3] project [s25 = fillEmpty (s16, null), s26 = fillEmpty (s20, null)]
[3] group [s12] [s16 = min (let [l1.0 = s15] if (! exists (l1.0) || typeMatch (l1.0, 0x00000440), Nothing, l1.0)), s20 = max (let [l2.0 = s19] if (! exists (l2.0) || typeMatch (l2.0, 0x00000440), Nothing, l2.0)), s24 = first (fillEmpty (s23, null))]
[3] traverse s23 s22 s21 [s10, s11, s12, s13, s15, s17, s19] {} {}
from
    [3] project [s21 = getField (s10, "price")]
    [3] traverse s19 s18 s17 [s10, s11, s12, s13, s15] {} {}
    from
        [3] project [s17 = getField (s10, "price")]
        [3] traverse s15 s14 s13 [s10, s11, s12] {} {}
        from
            [3] project [s13 = getField (s10, "price")]
            [3] project [s12 = fillEmpty (s11, null)]
            [3] project [s11 = getField (s10, "item")]
            [2] traverse s10 s9 s4 [s5] {} {}
            from
                [1] scan s4 s5 none none none none [] @"7fb2bafd-3565-4723-8153-e775e578c8c9" true false
            in
                [2] cfilter {isObject (s4)}
                [2] mkbson s9 s4 [item] keep [price = s8] true false
                [2] traverse s8 s7 s6 {} {}
                from
                    [2] project [s6 = getField (s4, "price")]
                    [2] limit 1
                    [2] coscan
                in
                    [2] cfilter {isObject (s6)}
                    [2] mkbson s7 s6 [a] keep [] true false
                    [2] limit 1
                    [2] coscan
        in
            [3] project [s14 = getField (s13, "a")]
            [3] limit 1
            [3] coscan
    in
        [3] project [s18 = getField (s17, "a")]
        [3] limit 1
        [3] coscan
in
    [3] project [s22 = getField (s21, "a")]
    [3] limit 1
    [3] coscan
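
The deduplication idea can be illustrated with a minimal JavaScript sketch. It is not the server's implementation (the SBE stage builder is C++); the function name collectFieldPaths and the slot naming are hypothetical and chosen only for illustration. The sketch walks a $group specification and assigns one slot per distinct field path, so that every accumulator referring to "$price.a" would share a single lookup:

```javascript
// Hypothetical sketch: map each distinct field path in a $group spec to one slot.
// This only models the bookkeeping; it does not build SBE stages.
function collectFieldPaths(groupSpec) {
  const slots = new Map(); // field path -> slot name (illustrative names)
  let next = 1;
  const visit = (expr) => {
    if (typeof expr === "string") {
      // "$price.a" is a field path; "$$ROOT" and similar are variables, so skip them.
      if (expr.startsWith("$") && !expr.startsWith("$$")) {
        const path = expr.slice(1);
        if (!slots.has(path)) {
          slots.set(path, "s" + next++);
        }
      }
    } else if (Array.isArray(expr)) {
      expr.forEach(visit);
    } else if (expr !== null && typeof expr === "object") {
      // Only values are visited, so operator keys such as "$min" are ignored.
      Object.values(expr).forEach(visit);
    }
  };
  visit(groupSpec);
  return slots;
}

// The $group specification from the example above.
const spec = {
  _id: "$item",
  min: { $min: "$price.a" },
  max: { $max: "$price.a" },
  f: { $first: "$price.a" },
};
const slots = collectFieldPaths(spec);
console.log(slots); // two entries: "item" and "price.a"
```

With this bookkeeping, the three accumulators refer to one slot for "price.a" instead of triggering three separate lookups.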


Generated at Thu Feb 08 05:51:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.