[SERVER-79552] $group rewrite for timeseries returns incorrect result if referencing the metaField in an object Created: 31/Jul/23  Updated: 29/Oct/23  Resolved: 19/Sep/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 5.0.19, 6.0.8, 7.0.0-rc11
Fix Version/s: 7.0.2, 5.0.22, 6.0.11

Type: Bug Priority: Major - P3
Reporter: Gil Alon Assignee: Erin Zhu
Resolution: Fixed Votes: 0
Labels: greenerbuild
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-78234 extend min/max pushdown for $group fo... Closed
Assigned Teams:
Query Integration
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.1, v7.0, v6.0, v5.0
Steps To Reproduce:

// Create a time-series collection.
// The metaField name "meta1" is inferred from the queries below.
db.createCollection("timeseries", {timeseries: {timeField: "time", metaField: "meta1"}});
const coll = db.getCollection("timeseries");

// Insert some documents.

// Run both queries.
coll.aggregate([{$group: {_id: {d: '$meta1.a.b'}, accmin: {$min: '$c'}}}]);

coll.aggregate([{$group: {_id: '$meta1.a.b', accmin: {$min: '$c'}}}]);

Both queries are rewritten to the same bucket-level stage:

{$group: {_id: '$meta.a.b', accmin: {$min: '$meta1.f1'}}}

Sprint: QI 2023-08-21, QI 2023-09-04, QI 2023-09-18, QI 2023-10-02
Participants:

Description

The $group rewrite for time-series does not take into account the difference between _id: { d: '$meta1.a.b' } and _id: '$meta1.a.b'. The rewrite returns the same result for both group queries, even though the returned documents should have different _id fields: the first query should yield _id values of the form { d: <value> }, while the second should yield the bare value. See Steps To Reproduce.
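
To make the expected difference concrete, the following plain-JavaScript sketch mimics a $group stage with a $min accumulator over a few made-up documents, without needing a server. The sample data and the helpers getPath, evalId, and groupMin are illustrative assumptions, not part of the ticket or of MongoDB, and the comparison ignores BSON ordering and missing-value rules that the real $min follows.

```javascript
// Resolve a dotted field path such as "meta1.a.b" against a document.
function getPath(doc, path) {
  return path.split(".").reduce((v, k) => (v == null ? undefined : v[k]), doc);
}

// Evaluate a $group-style _id spec: either a "$field.path" string or an
// object whose values are "$field.path" strings.
function evalId(doc, idSpec) {
  if (typeof idSpec === "string") {
    return getPath(doc, idSpec.slice(1));
  }
  const out = {};
  for (const [name, expr] of Object.entries(idSpec)) {
    out[name] = getPath(doc, expr.slice(1));
  }
  return out;
}

// Simplified stand-in for {$group: {_id: <idSpec>, accmin: {$min: <minExpr>}}}.
function groupMin(docs, idSpec, minExpr) {
  const groups = new Map();
  for (const doc of docs) {
    const id = evalId(doc, idSpec);
    const key = JSON.stringify(id);
    const value = getPath(doc, minExpr.slice(1));
    const entry = groups.get(key);
    if (!entry) {
      groups.set(key, {_id: id, accmin: value});
    } else if (value < entry.accmin) {
      entry.accmin = value;
    }
  }
  return [...groups.values()];
}

// Hypothetical sample measurements; "meta1" plays the role of the metaField.
const docs = [
  {time: new Date("2023-07-31T00:00:00Z"), meta1: {a: {b: 1}}, c: 5},
  {time: new Date("2023-07-31T00:01:00Z"), meta1: {a: {b: 1}}, c: 3},
  {time: new Date("2023-07-31T00:02:00Z"), meta1: {a: {b: 2}}, c: 7},
];

const asObject = groupMin(docs, {d: "$meta1.a.b"}, "$c");
const asPath = groupMin(docs, "$meta1.a.b", "$c");

console.log(JSON.stringify(asObject));
// [{"_id":{"d":1},"accmin":3},{"_id":{"d":2},"accmin":7}]
console.log(JSON.stringify(asPath));
// [{"_id":1,"accmin":3},{"_id":2,"accmin":7}]
```

The accumulator values agree in both runs; only the shape of _id differs, which is exactly the distinction the rewrite loses.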

This bug will be fixed in 7.1 by SERVER-78234, since that ticket rewrites much of this logic, but previous versions need to be corrected. The problem is that the implementation of DocumentSourceGroup::getIdFields returns a vector of size 1 in both cases, and the rewrite logic assumes the _id field looks like _id: '$meta1.a.b'. We can use the _idFieldNames in the group processor to differentiate between these two cases.
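
The check described above can be sketched outside the server as a small classifier over a $group _id specification. The actual fix is in the server's C++ code and the ticket only states that it relies on _idFieldNames; the function classifyGroupId and its return shape are hypothetical, and the sketch assumes that a bare field path carries no field names while an object-shaped _id carries one name per key.

```javascript
// Hypothetical classifier: reports the shape of a $group _id spec.
// "paths" plays the role of the extracted _id expressions (one entry in
// both cases from the ticket); "fieldNames" is what tells the cases apart.
function classifyGroupId(idSpec) {
  if (typeof idSpec === "string" && idSpec.startsWith("$")) {
    // _id: '$meta1.a.b' -> a single expression with no field names.
    return {kind: "fieldPath", fieldNames: [], paths: [idSpec.slice(1)]};
  }
  if (idSpec !== null && typeof idSpec === "object" && !Array.isArray(idSpec)) {
    // _id: {d: '$meta1.a.b'} -> one field name per key of the object.
    const paths = Object.values(idSpec)
      .filter((v) => typeof v === "string" && v.startsWith("$"))
      .map((v) => v.slice(1));
    return {kind: "object", fieldNames: Object.keys(idSpec), paths};
  }
  // Constants, null, arrays, and so on: not handled by this sketch.
  return {kind: "other", fieldNames: [], paths: []};
}

const bare = classifyGroupId("$meta1.a.b");
const wrapped = classifyGroupId({d: "$meta1.a.b"});

// Both specs yield exactly one path, so counting expressions cannot
// distinguish them; the field names can.
console.log(bare.paths.length, wrapped.paths.length);
console.log(bare.fieldNames, wrapped.fieldNames);
```

Under these assumptions, a rewrite that only inspects the number of _id expressions treats both specs identically, whereas one that also inspects the field names can keep the { d: ... } wrapper in the output.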



Comments
Comment by Githook User [ 19/Sep/23 ]

Author:

{'name': 'Erin Zhu', 'email': 'erin.zhu@mongodb.com', 'username': 'erinzhu001'}

Message: SERVER-79552 [v7.0]: Fix $group for time-series rewrite to account for differences in _id fields

(cherry picked from commit de061cfb49506f60788b1cca74d9fe661ff5f873)
Branch: v5.0
https://github.com/mongodb/mongo/commit/319811c1a1eb229062aad78ab2f111e36c93b036

Comment by Githook User [ 19/Sep/23 ]

Author:

{'name': 'Erin Zhu', 'email': 'erin.zhu@mongodb.com', 'username': 'erinzhu001'}

Message: SERVER-79552 [v7.0]: Fix $group for time-series rewrite to account for differences in _id fields

(cherry picked from commit de061cfb49506f60788b1cca74d9fe661ff5f873)
Branch: v6.0
https://github.com/mongodb/mongo/commit/2b8f55934a0559bb02376e000a40ac1c0bce52e6

Comment by Githook User [ 14/Sep/23 ]

Author:

{'name': 'Erin Zhu', 'email': 'erin.zhu@mongodb.com', 'username': 'erinzhu001'}

Message: SERVER-79552 [v7.0]: Fix $group for time-series rewrite to account for differences in _id fields
Branch: v7.0
https://github.com/mongodb/mongo/commit/de061cfb49506f60788b1cca74d9fe661ff5f873

Generated at Thu Feb 08 06:41:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.