[SERVER-60895] Multiple accumulators on the same field generate a plan with duplicated getField() calls Created: 21/Oct/21  Updated: 06/Dec/22  Resolved: 21/Oct/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Irina Yatsenko (Inactive) Assignee: Backlog - Query Execution
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
backported by SERVER-60893 Deduplicate field lookups for a same ... Closed
Assigned Teams:
Query Execution
Participants:

 Description   

Start mongod with "--setParameter internalQueryForceClassicEngine=false --setParameter featureFlagSBEGroupPushdown=true" and run the following query (if the $min and $max accumulators aren't enabled yet, replace them with $sum):

db.LS.explain().aggregate([{$group: {_id:"$a", o1:{$min:"$c"}, o2:{$max:"$c"}}}]).queryPlanner.winningPlan.slotBasedPlan.stages

The output will look something like this:

[2] mkbson s14 [_id = s7, o1 = s12, o2 = s13] true false
[2] project [s12 = fillEmpty (s9, null), s13 = fillEmpty (s11, null)]
[2] group [s7] [s9 = min (let [l1.0 = s8] if (! exists (l1.0) || typeMatch (l1.0, 0x00000440), Nothing, l1.0)), s11 = max (let [l2.0 = s10] if (! exists (l2.0) || typeMatch (l2.0, 0x00000440), Nothing, l2.0))]
[2] project [s10 = getField (s4, "c")]
[2] project [s8 = getField (s4, "c")]
[2] project [s7 = fillEmpty (s6, null)]
[2] project [s6 = getField (s4, "a")]
[1] scan s4 s5 none none none none [] @"c494dfc1-7ed7-45e7-a46d-b253a1e532db" true false

Notice that getField (s4, "c") is evaluated twice, once for s8 and once for s10, even though both accumulators read the same field "c". We should eliminate this duplication.
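One way to picture the fix is a small cache keyed by field name, so that a second accumulator on the same field reuses the slot the first one produced. The sketch below is illustrative only: the function name buildFieldProjections, its parameters, and the string form of the stages are all hypothetical and are not taken from the server's planner code.

```javascript
// Hypothetical sketch, not MongoDB's actual planner code: hand out one slot per
// distinct field name so that repeated lookups on the same input share a slot.
function buildFieldProjections(inputSlot, fields, firstFreeSlot) {
  const slotByField = new Map(); // field name -> slot already holding its value
  const stages = [];
  let nextSlot = firstFreeSlot;
  for (const field of fields) {
    if (!slotByField.has(field)) {
      // First time this field is requested: allocate a slot and emit a project.
      const slot = `s${nextSlot++}`;
      slotByField.set(field, slot);
      stages.push(`project [${slot} = getField (${inputSlot}, "${field}")]`);
    }
    // Otherwise the cached slot is reused and no new stage is emitted.
  }
  return { stages, slotByField };
}

// Both accumulators in the query above read "c", so only one project is emitted.
const result = buildFieldProjections("s4", ["c", "c"], 8);
console.log(result.stages);
```

With this kind of cache, both the $min and the $max accumulator would read from the single slot recorded for "c" instead of from s8 and s10.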

 Comments   
Comment by Irina Yatsenko (Inactive) [ 21/Oct/21 ]

Yoonsoo beat me to it: https://jira.mongodb.org/browse/SERVER-60893

Comment by Irina Yatsenko (Inactive) [ 21/Oct/21 ]

If the query accesses the same sub-field more than once, the plan looks like this:

 

db.LS.explain().aggregate([{$group: {_id:"$a", o1:{$min:"$e.c"}, o2:{$max:"$e.c"}}}]).queryPlanner.winningPlan.slotBasedPlan.stages

 

[3] mkbson s23 [_id = s12, o1 = s21, o2 = s22] true false
[3] project [s21 = fillEmpty (s16, null), s22 = fillEmpty (s20, null)]
[3] group [s12] [s16 = min (let [l1.0 = s15] if (! exists (l1.0) || typeMatch (l1.0, 0x00000440), Nothing, l1.0)), s20 = max (let [l2.0 = s19] if (! exists (l2.0) || typeMatch (l2.0, 0x00000440), Nothing, l2.0))]
[3] traverse s19 s18 s17 [s10, s11, s12, s13, s15] {} {}
from
  [3] project [s17 = getField (s10, "e")]
  [3] traverse s15 s14 s13 [s10, s11, s12] {} {}
  from
    [3] project [s13 = getField (s10, "e")]
    [3] project [s12 = fillEmpty (s11, null)]
    [3] project [s11 = getField (s10, "a")]
    [2] traverse s10 s9 s4 [s5] {} {}
    from
        [1] scan s4 s5 none none none none [] @"c494dfc1-7ed7-45e7-a46d-b253a1e532db" true false
    in
        [2] cfilter {isObject (s4)}
        [2] mkbson s9 s4 [a] keep [e = s8] true false
        [2] traverse s8 s7 s6 {} {}
        from
          [2] project [s6 = getField (s4, "e")]
          [2] limit 1
          [2] coscan
        in
          [2] cfilter {isObject (s6)}
          [2] mkbson s7 s6 [c] keep [] true false
          [2] limit 1
          [2] coscan
  in
    [3] project [s14 = getField (s13, "c")]
    [3] limit 1
    [3] coscan
in
  [3] project [s18 = getField (s17, "c")]
  [3] limit 1
  [3] coscan
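To spot this kind of duplication without reading the plan by eye, a small helper can count textually identical getField calls. This is a rough sketch that assumes the plan is available as plain text; the helper name findDuplicateGetFields and the regular expression are hypothetical, and the sample lines are copied from the plan above.

```javascript
// Hypothetical helper: report getField calls that appear more than once with
// the same input slot and field name in a textual SBE plan.
function findDuplicateGetFields(planText) {
  const counts = new Map(); // "slot.field" -> number of occurrences
  const pattern = /getField \((s\d+), "([^"]+)"\)/g;
  for (const match of planText.matchAll(pattern)) {
    const key = `${match[1]}.${match[2]}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > 1).map(([key]) => key);
}

// A few project stages copied from the sub-field plan above.
const samplePlan = [
  '[3] project [s17 = getField (s10, "e")]',
  '[3] project [s13 = getField (s10, "e")]',
  '[3] project [s11 = getField (s10, "a")]',
  '[3] project [s14 = getField (s13, "c")]',
  '[3] project [s18 = getField (s17, "c")]',
].join("\n");

console.log(findDuplicateGetFields(samplePlan)); // [ 's10.e' ]
```

The helper only catches calls that are textually identical. The two lookups of "c" read from different slots (s13 and s17), so they are not reported even though both slots hold the same value of "e".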

 
