[SERVER-73241] Better path tracking when $$ROOT is used in a $group accumulator Created: 24/Jan/23  Updated: 31/Oct/23

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Chris Harris
Assignee: Backlog - Query Optimization
Resolution: Unresolved
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Optimization
Participants:

Description

Consider the following two aggregations:

db.foo.aggregate([
  { '$group': { _id: '$city', data: { '$first': '$$ROOT' } } },
  { '$match': { _id: { '$in': [ 'Austin', 'Dallas' ] } } }
])

db.foo.aggregate([
  { '$group': { _id: '$city', data: { '$first': '$$ROOT' } } },
  { '$match': { 'data.city': { '$in': [ 'Austin', 'Dallas' ] } } }
])

Logically, they produce the same result set: after the $group, data.city holds the same value as _id, because both derive from the city field of the originating document and that field is the grouping key.

The former pipeline can participate in the $match/$group optimization implemented via SERVER-34741, while the latter currently cannot. This ticket will improve our dependency tracking and analysis so that pipelines such as the latter can participate in that optimization as well.
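
To illustrate why the rewrite would be safe, here is a minimal sketch in plain JavaScript (no MongoDB involved) that simulates the second pipeline both as written and with the filter applied before grouping. It assumes the optimization amounts to evaluating the $match ahead of the $group; the helper groupFirstByCity and the sample documents are hypothetical and are not taken from the server code.

```javascript
// Plain JavaScript simulation; nothing here calls MongoDB.
// groupFirstByCity and the sample documents are hypothetical.
function groupFirstByCity(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // Mimic { $first: '$$ROOT' }: keep the first document seen per city.
    if (!groups.has(doc.city)) {
      groups.set(doc.city, { _id: doc.city, data: doc });
    }
  }
  return [...groups.values()];
}

const docs = [
  { city: 'Austin', n: 1 },
  { city: 'Boston', n: 2 },
  { city: 'Dallas', n: 3 },
  { city: 'Austin', n: 4 },
];
const wanted = new Set(['Austin', 'Dallas']);

// Second pipeline as written: group first, then match on data.city.
const groupThenMatch = groupFirstByCity(docs)
  .filter((group) => wanted.has(group.data.city));

// Assumed rewritten form: match on city first, then group.
const matchThenGroup = groupFirstByCity(
  docs.filter((doc) => wanted.has(doc.city))
);
```

Filtering first preserves the relative order of the surviving documents, so the document picked as the first of each group is the same in both variants, and the filter no longer has to run over groups that would be discarded anyway.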

This enhancement will apply both to single-field groupings, as shown, and to compound groupings, where similar logic can be applied.
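
As a hedged illustration of the compound case, the pipelines below group on two fields and then filter on one of them through data. The state field and the value 'TX' are assumptions made up for this example, and the sketch assumes scalar field values. Either array could be passed to db.foo.aggregate(...).

```javascript
// Hypothetical compound grouping; the `state` field and 'TX' are assumptions.
const compoundPipeline = [
  { $group: { _id: { city: '$city', state: '$state' }, data: { $first: '$$ROOT' } } },
  // Filters on a field reached through $$ROOT rather than through _id.
  { $match: { 'data.state': 'TX' } },
];

// Logically equivalent form that filters on the group key instead.
// data.state and _id.state both derive from the originating document's state.
const equivalentOnId = [
  compoundPipeline[0],
  { $match: { '_id.state': 'TX' } },
];
```

With improved path tracking, the first form could be recognized as equivalent to the second and treated the same way by the optimizer.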


Generated at Thu Feb 08 06:24:05 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.