[SERVER-71270] Time series optimization pushes $match on timeField before $project stage that removes the field from pipeline
Created: 10/Nov/22  Updated: 29/Oct/23  Resolved: 16/Nov/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.2.0-rc2, 6.3.0-rc0, 6.0.7

Type: Bug Priority: Major - P3
Reporter: Davis Haupt (Inactive) Assignee: Ivan Fefer
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
is caused by SERVER-70269 Avoid applying match filter to the un... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v6.2, v6.0
Steps To Reproduce:

const doc = {
  _id: 0,
  time: new Date('2019-01-18T13:24:15.443Z'),
  tag: {},
};
 
db.ts.drop();
db.coll.drop();
 
db.createCollection('ts', {timeseries: {timeField: 'time', metaField: 'tag'}});
db.createCollection('coll');
 
db.ts.insertOne(doc);
db.coll.insertOne(doc);
const pipeline = [
  {$project: {'time': 0}},
  {$match: {'time': {$lte: new Date('2019-02-13T11:36:03.481Z')}}},
];
 
const ts = db.ts.aggregate(pipeline).toArray();
// [ { "tag" : { }, "_id" : 0 } ]  (incorrect: should be [ ])
const vanilla = db.coll.aggregate(pipeline).toArray();
// [ ]  (correct: 'time' is removed before the $match runs)

Sprint: QO 2022-11-14, QE 2022-11-28
Participants:
Linked BF Score: 135

 Description   

With the query:

[
  {$project: {'time': 0}},
  {$match: {'time': {$lte: new Date('2019-02-13T11:36:03.481Z')}}},
]

The optimized plan looks like:

"stages" : [
	{
		"$cursor" : {
			"queryPlanner" : {
				"namespace" : "test.system.buckets.ts",
				"indexFilterSet" : false,
				"parsedQuery" : {
					"$and" : [
						{
							"_id" : {
								"$lte" : ObjectId("5c640124ffffffffffffffff")
							}
						},
						{
							"control.max.time" : {
								"$_internalExprLte" : ISODate("2019-02-13T12:36:03.481Z")
							}
						},
						{
							"control.min.time" : {
								"$_internalExprLte" : ISODate("2019-02-13T11:36:03.481Z")
							}
						}
					]
				},
				"queryHash" : "A79A3A87",
				"planCacheKey" : "A79A3A87",
				"maxIndexedOrSolutionsReached" : false,
				"maxIndexedAndSolutionsReached" : false,
				"maxScansToExplodeReached" : false,
				"winningPlan" : {
					"stage" : "CLUSTERED_IXSCAN",
					"filter" : {
						"$and" : [
							{
								"_id" : {
									"$lte" : ObjectId("5c640124ffffffffffffffff")
								}
							},
							{
								"control.max.time" : {
									"$_internalExprLte" : ISODate("2019-02-13T12:36:03.481Z")
								}
							},
							{
								"control.min.time" : {
									"$_internalExprLte" : ISODate("2019-02-13T11:36:03.481Z")
								}
							}
						]
					},
					"direction" : "forward",
					"minRecord" : ObjectId("000000000000000000000000"),
					"maxRecord" : ObjectId("5c640124ffffffffffffffff")
				},
				"rejectedPlans" : [ ]
			}
		}
	},
	{
		"$_internalUnpackBucket" : {
			"exclude" : [
				"time"
			],
			"timeField" : "time",
			"metaField" : "tag",
			"bucketMaxSpanSeconds" : 3600,
			"assumeNoMixedSchemaData" : true,
			"wholeBucketFilter" : {
				"control.max.time" : {
					"$lte" : ISODate("2019-02-13T11:36:03.481Z")
				}
			},
			"eventFilter" : {
				"time" : {
					"$lte" : ISODate("2019-02-13T11:36:03.481Z")
				}
			}
		}
	}
]

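The $match appears to be pushed ahead of the $project. The $cursor stage filters buckets by bounds derived from the $match, and $_internalUnpackBucket then applies the same predicate as an eventFilter while excluding the time field, which is all that remains of the $project. The predicate therefore runs against documents that still contain time, so the time series collection returns the document while the same pipeline on a regular collection returns nothing.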
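
The effect of the reordering can be modeled outside the server. The following is a minimal sketch in plain JavaScript, not MongoDB code: projectOutTime and matchTimeLte are hypothetical stand-ins for the two stages, and the model assumes that a document without a time field never satisfies $lte against a Date (which is what the regular collection in the repro shows).

```javascript
// Plain JavaScript model of the two stage orders; not MongoDB server code.
const doc = {_id: 0, time: new Date('2019-01-18T13:24:15.443Z'), tag: {}};
const bound = new Date('2019-02-13T11:36:03.481Z');

// Models {$project: {time: 0}}: drop the field from every document.
const projectOutTime = (docs) => docs.map(({time, ...rest}) => rest);

// Models {$match: {time: {$lte: bound}}}: a missing field never matches.
const matchTimeLte = (docs) =>
  docs.filter((d) => d.time instanceof Date && d.time <= bound);

// Order written in the pipeline: $project, then $match.
const asWritten = matchTimeLte(projectOutTime([doc]));

// Order in the optimized plan: $match first, then the exclusion.
const asOptimized = projectOutTime(matchTimeLte([doc]));

console.log(asWritten.length, asOptimized.length); // prints "0 1"
```

The two orders disagree on the single test document, which mirrors the difference between the regular collection and the time series collection in the steps to reproduce.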

For this query to be correct, a $match must not be pushed before a $project if the $project removes or modifies a field that the $match references.
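
The rule above can be sketched as a dependency check. This is only an illustration under simplifying assumptions, not the server's implementation: matchDependsOnExcludedField is a hypothetical helper, it handles only exclusion projections ({field: 0}), and it conservatively reports a dependency for any $match that uses a top-level operator such as $and, $or, or $expr.

```javascript
// Hypothetical helper, not the server's implementation.
// Returns true when the $match must stay after the $project.
function matchDependsOnExcludedField(projectSpec, matchSpec) {
  const matchKeys = Object.keys(matchSpec);
  // Be conservative: do not analyze $and/$or/$expr, just report a dependency.
  if (matchKeys.some((k) => k.startsWith('$'))) return true;

  // Fields removed by an exclusion projection.
  const excluded = Object.keys(projectSpec).filter(
    (f) => projectSpec[f] === 0 || projectSpec[f] === false);

  // Two paths conflict if they are equal or one is a dotted prefix of the other.
  const overlaps = (a, b) =>
    a === b || a.startsWith(b + '.') || b.startsWith(a + '.');

  return matchKeys.some((m) => excluded.some((e) => overlaps(m, e)));
}

const match = {time: {$lte: new Date('2019-02-13T11:36:03.481Z')}};

// The pipeline from this ticket: the projection removes the matched field.
console.log(matchDependsOnExcludedField({time: 0}, match)); // true

// A projection that does not touch the matched field.
console.log(matchDependsOnExcludedField({tag: 0}, match));  // false
```

Treating dotted prefixes as conflicts keeps the check safe for nested paths: excluding a parent field also removes any subfield that the $match references.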



 Comments   
Comment by Githook User [ 08/Jun/23 ]

Author:

{'name': 'Ivan Fefer', 'email': 'ivan.fefer@mongodb.com', 'username': 'Fefer-Ivan'}

Message: SERVER-71270 In timeseries collections prevent match pushdown before project that can affect it

(cherry picked from commit e332ea00d872740058898541277bf6547774be90)
Branch: v6.0
https://github.com/mongodb/mongo/commit/2c0869c20f060ee49c00b4dc18423b22edd9bbd3

Comment by Githook User [ 23/Nov/22 ]

Author:

{'name': 'Ivan Fefer', 'email': 'ivan.fefer@mongodb.com', 'username': 'Fefer-Ivan'}

Message: SERVER-71270 In timeseries collections prevent match pushdown before project that can affect it
Branch: v6.2
https://github.com/mongodb/mongo/commit/bb0c1608a165ca60c9ec6ded4df9655791b6e16d

Comment by Ivan Fefer [ 16/Nov/22 ]

We will need to backport this to 6.2.0.

Comment by Githook User [ 16/Nov/22 ]

Author:

{'name': 'Ivan Fefer', 'email': 'ivan.fefer@mongodb.com', 'username': 'Fefer-Ivan'}

Message: SERVER-71270 In timeseries collections prevent match pushdown before project that can affect it
Branch: master
https://github.com/mongodb/mongo/commit/e332ea00d872740058898541277bf6547774be90

Comment by Davis Haupt (Inactive) [ 11/Nov/22 ]

Since this is related to SERVER-70269, assigning this to ivan.fefer@mongodb.com who was on the review for that change.

Comment by Davis Haupt (Inactive) [ 11/Nov/22 ]

Seems likely to have been caused by SERVER-70269.
