Loading...

XML

Word

Printable

JSON

Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
None

Assigned Teams:

Query Execution
Sprint:
QE 2023-08-21, QE 2023-09-04, QE 2023-09-18, QE 2023-10-02, QE 2023-10-16, QE 2023-10-30
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

in this group query we are grouping several fields with a common subpath

db.a.explain().aggregate({$match: {"magic.position": "air"}}, {$group: {_id: {a: "$foo.b.c", d: "$foo.e.f", g: "$foo.h.i"}}}).queryPlanner.winningPlan.slotBasedPlan.stages
[2] mkbson s13 [_id = s12] true false
[2] project [s12 = newObj("a", s9, "d", s10, "g", s11)]
[2] group [s9, s10, s11] []
[2] project [s9 = traverseP(s4, lambda(l101.0) { traverseP(getField(move(l101.0), "b"), lambda(l102.0) { getField(move(l102.0), "c") }, 1) }, 1), s10 = traverseP(s4, lambda(l101.0) { traverseP(getField(move(l101.0), "e"), lambda(l102.0) { getField(move(l102.0), "f") }, 1) }, 1), s11 = traverseP(s4, lambda(l101.0) { traverseP(getField(move(l101.0), "h"), lambda(l102.0) { getField(move(l102.0), "i") }, 1) }, 1)]
[1] filter {traverseF(s5, lambda(l1.0) { traverseF(getField(l1.0, "position"), lambda(l2.0) { ((l2.0 == s8) ?: false) }, false) }, false)}
[1] scan s6 s7 none none none none lowPriority [s4 = foo, s5 = magic] @"5ddad002-6f31-4499-a881-8cb09547796e" true false

each field on the sub-object "foo" has their own traverse. Even worse if there was a common subpath after foo (eg "foo.bar.a", "foo.bar.b", "foo.bar.c") we traverse for the bar field multiple times, when it could just be done once. These n^2 lookups become more painful when documents are large and there are many paths to lookup.

One solution is to try to generate plans that look more like this projection query.

db.a.explain().aggregate({$project: {"foo.b.c": 1, "foo.e.f": 1, "foo.h.i": 1, _id: 0}}).queryPlanner.winningPlan.slotBasedPlan.stages
[2] project [s6 = traverseP(s4, lambda(l1.0) {
    if isObject(l1.0)
    then makeBsonObj(MakeObjSpec(keep, [], ["foo"]), l1.0, traverseP(getField(l1.0, "foo"), lambda(l2.0) {
        if isObject(l2.0)
        then makeBsonObj(MakeObjSpec(keep, [], ["b", "e", "h"]), l2.0, traverseP(getField(l2.0, "b"), lambda(l3.0) {
            if isObject(l3.0)
            then makeBsonObj(MakeObjSpec(keep, ["c"], []), l3.0)
            else Nothing
        }, Nothing), traverseP(getField(l2.0, "e"), lambda(l4.0) {
            if isObject(l4.0)
            then makeBsonObj(MakeObjSpec(keep, ["f"], []), l4.0)
            else Nothing
        }, Nothing), traverseP(getField(l2.0, "h"), lambda(l5.0) {
            if isObject(l5.0)
            then makeBsonObj(MakeObjSpec(keep, ["i"], []), l5.0)
            else Nothing
        }, Nothing))
        else Nothing
    }, Nothing))
    else Nothing
}, Nothing)]
[1] scan s4 s5 none none none none lowPriority [] @"5ddad002-6f31-4499-a881-8cb09547796e" true false

this still however has a similar problem where each call to getField is done independently and can result in the full bson or subobject in the bson being scanned multiple times.

there is logic in the classic engine here to avoid scanning a document multiple times if possible while doing an inclusion projection https://github.com/mongodb/mongo/blob/2e0259b3050e4c27d47e353222395d21bb80b9e4/src/mongo/db/exec/fastpath_projection_node.h#L88

is depended on by

SERVER-84021 Depricate $_internalShredDocuments

Backlog

is duplicated by

SERVER-83533 Improve efficiency of SBE when running projections with same prefix

Closed

is related to

SERVER-94416 Add scenario that accesses many subfields with same parent field

Closed

SERVER-79975 [Classic] Optimize large group keys

Closed

related to

SERVER-79975 [Classic] Optimize large group keys

Closed

Assignee:: [DO NOT USE] Backlog - Query Execution
Reporter:: Mickey Winters
Participants:: [DO NOT USE] Backlog - Query Execution, Alberto Massari, Mickey Winters
Votes:: 3 Vote for this issue
Watchers:: 21 Start watching this issue

Created:: Jul 31 2023 09:10:15 PM UTC
Updated:: Mar 04 2025 09:02:52 AM UTC

Details

Description

Attachments

Issue Links

Forms

Activity

People

Dates