[SERVER-77183] $project followed by $group gives incorrect results sometimes Created: 16/May/23  Updated: 29/Oct/23  Resolved: 14/Jul/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 6.0.5
Fix Version/s: 7.1.0-rc0, 7.0.1, 6.0.10

Type: Bug Priority: Major - P3
Reporter: Oleksandr Poliakov Assignee: Rui Liu
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-75267 Summing $bsonSize in $group stage wit... Closed
is related to CSHARP-4656 Simplify A : "$A" to A : 1 only on find Closed
Tested
Assigned Teams:
Query Execution
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.0, v6.0
Steps To Reproduce:

Enterprise test> db.test.find()
[
    { _id: 1, A: 2, B: 3 },
    { _id: 2, A: 2, B: 3 },
    { _id: 3, A: 4, B: 3 }
]
Enterprise test> db.test.aggregate([{ $project : { A : "$A", B : "$B", _id : 0 } }, { $group : { _id : "$$ROOT" } }])
[ 
    { _id: { A: 4, B: 3 } },
    { _id: { A: 2, B: 3 } }
]
Enterprise test> db.test.aggregate([{ $project : { A : 1, B : 1, _id : 0 } }, { $group : { _id : "$$ROOT" } }])
[
    { _id: { _id: 1, A: 2, B: 3 } },
    { _id: { _id: 2, A: 2, B: 3 } },
    { _id: { _id: 3, A: 4, B: 3 } }
]
Enterprise test>

Sprint: QO 2023-06-12, QE 2023-07-24
Participants:

 Description   

See steps to reproduce.

In the steps below we expected both aggregation pipelines return the same results. In server version before v6 it was working as expected. The problem seems to started with v6 (we tested 6.0.2 and 6.0.5 versions).

The difference between two aggregation pipelines is that in the first case we project using A : "$A" and in the second we project using A : 1. We expect them to be equivalent.



 Comments   
Comment by Githook User [ 22/Aug/23 ]

Author:

{'name': 'Peter Volk', 'email': 'peter.volk@mongodb.com', 'username': 'HCSPete'}

Message: SERVER-77183 $project followed by $group gives incorrect results sometimes
Branch: v6.0
https://github.com/mongodb/mongo/commit/7130535cf9c50d92b773c33d507381a48de78788

Comment by Githook User [ 21/Aug/23 ]

Author:

{'name': 'Rui Liu', 'email': 'lriuui0x0@gmail.com', 'username': 'lriuui0x0'}

Message: SERVER-77183 Only eliminate projection before group when group doesn't require full document

(cherry picked from commit 107f24c1bf0c8f7041978b2dfb718bb6694e0ba5)
Branch: v7.0
https://github.com/mongodb/mongo/commit/00bd991c1fb5789bab9d93f43efe82ae509085ab

Comment by Kyle Suarez [ 18/Aug/23 ]

Requesting backports to 6.0 and 7.0 given the affects version is 6.0.5.

Comment by Githook User [ 14/Jul/23 ]

Author:

{'name': 'Rui Liu', 'email': 'lriuui0x0@gmail.com', 'username': 'lriuui0x0'}

Message: SERVER-77183 Only eliminate projection before group when group doesn't require full document
Branch: master
https://github.com/mongodb/mongo/commit/107f24c1bf0c8f7041978b2dfb718bb6694e0ba5

Comment by Rushan Chen [ 04/Jul/23 ]

Background: project seems to be lost and we should make sure SBE group push down didn't cause this. SBE Group pushdown is owned by QE.

Comment by Projjal Chanda [ 03/Jul/23 ]

anton.korshunov@mongodb.com . This looks like a QO issue. The query plan for the second query is wrong as the projection stage is missing.

Comment by Anton Korshunov [ 20/Jun/23 ]

Sending to QE for investigation as the plan is executed with the SBE engine in v6.0.

Comment by James Kovacs [ 16/May/23 ]

Looking at the explain plans for the two queries:

Enterprise test> db.test.explain().aggregate([{$project: {A:"$A", B:"$B", _id:0}}, {$group: {_id:"$$ROOT"}}])
...
    winningPlan: {
      queryPlan: {
        stage: 'GROUP',
        planNodeId: 3,
        inputStage: {
          stage: 'PROJECTION_DEFAULT',
          planNodeId: 2,
          transformBy: { A: '$A', B: '$B', _id: false },
          inputStage: {
            stage: 'COLLSCAN',
            planNodeId: 1,
            filter: {},
            direction: 'forward'
          }
        }
      },
...

Notice how GROUP takes a PROJECTION_DEFAULT as its input. Now compare that to…

Enterprise test> db.test.explain().aggregate([{$project: {A:1, B:1, _id:0}}, {$group: {_id:"$$ROOT"}}])
...
    winningPlan: {
      queryPlan: {
        stage: 'GROUP',
        planNodeId: 2,
        inputStage: {
          stage: 'COLLSCAN',
          planNodeId: 1,
          filter: {},
          direction: 'forward'
        }
      },
...

The GROUP takes a COLLSCAN as input disregarding the $project entirely.

Generated at Thu Feb 08 06:34:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.