[SERVER-54769] A pipeline with a $sort and $project against a time-series collection leaves new project in the pipeline Created: 24/Feb/21  Updated: 06/Dec/22  Resolved: 28/Apr/21

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Eric Cox (Inactive) Assignee: Backlog - Query Optimization
Resolution: Duplicate Votes: 0
Labels: qopt-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-54864 Peform dependancy analysis and includ... Closed
Assigned Teams:
Query Optimization
Operating System: ALL
Steps To Reproduce:

To debug this, I stepped through the JS commands in timeseries_bucket_limit_count.js up to line 51.

This can also be reproduced by running an aggregation pipeline directly against the system.buckets.X collection:

 db.system.buckets.timeseries_bucket_limit_count_1.aggregate([ { "$_internalUnpackBucket" : { "timeField" : "time", "exclude" : [ ] } }, { "$sort" : { "_id" : 1 } }, { "$project" : { "x" : 1 } } ])

The explain output for the query above shows the redundant $project.
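As a rough aid for spotting the duplicate without reading the whole explain document, the stages array can be scanned for $project specs that occur more than once. This is only a sketch: findRepeatedProjects is a hypothetical helper, not a shell or server function, and the stages literal is abbreviated by hand from the explain output reported in this ticket.

```javascript
// Hypothetical helper (not part of the mongo shell): given the "stages" array
// of an aggregate explain, return every $project spec that occurs more than once.
function findRepeatedProjects(stages) {
    const counts = new Map();
    for (const stage of stages) {
        if (!Object.prototype.hasOwnProperty.call(stage, "$project")) {
            continue;
        }
        // Assumption: identical specs serialize with the same key order.
        const key = JSON.stringify(stage["$project"]);
        counts.set(key, (counts.get(key) || 0) + 1);
    }
    return [...counts.entries()]
        .filter(([, count]) => count > 1)
        .map(([key, count]) => ({spec: JSON.parse(key), count: count}));
}

// Stages abbreviated from the explain output in this ticket ($cursor body omitted).
const explainStages = [
    {"$cursor": {}},
    {"$_internalUnpackBucket": {include: ["_id", "x"], timeField: "time"}},
    {"$project": {_id: true, x: true}},
    {"$sort": {sortKey: {_id: 1}}},
    {"$project": {_id: true, x: true}},
];

const repeatedProjects = findRepeatedProjects(explainStages);
console.log(JSON.stringify(repeatedProjects));
```

Against a live server, the same function could presumably be given the stages field of the document returned by explain().aggregate(...), assuming the output has the shape shown in this ticket.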

Participants:

 Description   

Here is an example in which the explain output contains two $project stages for an aggregation pipeline run against a time-series view.

> coll.explain().aggregate([ { "$sort" : { "_id" : 1 } }, { "$project" : { "x" : 1 } } ])
{
	"explainVersion" : "1",
	"stages" : [
		{
			"$cursor" : {
				"queryPlanner" : {
					"namespace" : "test.system.buckets.timeseries_bucket_limit_count_1",
					"indexFilterSet" : false,
					"parsedQuery" : {
					},
					"queryHash" : "8B3D4AB8",
					"planCacheKey" : "8B3D4AB8",
					"maxIndexedOrSolutionsReached" : false,
					"maxIndexedAndSolutionsReached" : false,
					"maxScansToExplodeReached" : false,
					"winningPlan" : {
						"stage" : "COLLSCAN",
						"direction" : "forward"
					},
					"rejectedPlans" : [ ]
				}
			}
		},
		{
			"$_internalUnpackBucket" : {
				"include" : [
					"_id",
					"x"
				],
				"timeField" : "time"
			}
		},
		{
			"$project" : {
				"_id" : true,
				"x" : true
			}
		},
		{
			"$sort" : {
				"sortKey" : {
					"_id" : 1
				}
			}
		},
		{
			"$project" : {
				"_id" : true,
				"x" : true
			}
		}
	],
	"serverInfo" : {
		"host" : "cedar",
		"port" : 27017,
		"version" : "4.9.0-alpha4-383-g142e95a",
		"gitVersion" : "142e95a1dc2a2ba9c90ea610c30ee697e880e967"
	},
	"command" : {
		"aggregate" : "system.buckets.timeseries_bucket_limit_count_1",
		"pipeline" : [
			{
				"$_internalUnpackBucket" : {
					"timeField" : "time",
					"exclude" : [ ]
				}
			},
			{
				"$sort" : {
					"_id" : 1
				}
			},
			{
				"$project" : {
					"x" : 1
				}
			}
		],
		"cursor" : {
		},
		"collation" : {
			"locale" : "simple"
		}
	},
	"ok" : 1
}

The new $project, which appears first in the stages, is internalized into $_internalUnpackBucket (as its include list) but is not removed from the pipeline.
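To illustrate what "internalized but not removed" means, here is a minimal sketch of the cleanup one might expect. It is not the server's implementation and not the change made in SERVER-54864; dropInternalizedProject and isSameInclusion are hypothetical names, and the check deliberately handles only simple top-level inclusion projections.

```javascript
// Hypothetical check: does an inclusion $project keep exactly the fields
// already listed in the unpack stage's "include" array?
function isSameInclusion(spec, include) {
    const fields = Object.keys(spec);
    const kept = fields.filter((f) => spec[f] === true || spec[f] === 1);
    // Anything other than plain inclusion (exclusions, expressions) is left alone.
    if (kept.length !== fields.length) {
        return false;
    }
    return kept.length === include.length && kept.every((f) => include.includes(f));
}

// Hypothetical cleanup: drop a $project that directly follows
// $_internalUnpackBucket when the unpack stage already covers it.
function dropInternalizedProject(stages) {
    const out = [];
    for (const stage of stages) {
        const prev = out[out.length - 1];
        const unpack = prev && prev["$_internalUnpackBucket"];
        const include = unpack && unpack.include;
        const spec = stage["$project"];
        if (include && spec && isSameInclusion(spec, include)) {
            continue; // redundant: the unpack stage already limits the fields
        }
        out.push(stage);
    }
    return out;
}

// Stages abbreviated from the explain output above ($cursor body omitted).
const reportedStages = [
    {"$cursor": {}},
    {"$_internalUnpackBucket": {include: ["_id", "x"], timeField: "time"}},
    {"$project": {_id: true, x: true}},
    {"$sort": {sortKey: {_id: 1}}},
    {"$project": {_id: true, x: true}},
];

const cleanedStages = dropInternalizedProject(reportedStages);
console.log(cleanedStages.map((s) => Object.keys(s)[0]).join(" -> "));
```

In this sketch the trailing $project after $sort is kept, since it was part of the user's original pipeline; only the copy sitting directly after the unpack stage is treated as redundant.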



 Comments   
Comment by David Storch [ 28/Apr/21 ]

Since this was fixed by SERVER-54864, I'm closing it with the resolution "Duplicate" rather than "Fixed".

Comment by Hana Pearlman [ 03/Mar/21 ]

This bug is fixed by SERVER-54864.
