Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 2.6.0
Component/s: Aggregation Framework
Labels:
Assigned Teams: Query Optimization
Backwards Compatibility: Fully Compatible
Sprint: Query 2017-03-27, Query 2017-04-17, QO 2022-10-03
Case:
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

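Even though the user explicitly asks to include specific fields, we can see that no later stage actually uses them. Doesn't it make sense to optimize the $project away, as if it weren't there? With the $project, the $cursor stage fetches username and _id; without it, the cursor requests no fields at all ($noFieldsNeeded).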

db.t1.aggregate([ {$match:{username:/^user8/}}, {$project:{username:1}},{$group:{_id:1,count:{$sum:1}}}],{explain:true})
{
	"stages" : [
		{
			"$cursor" : {
				"query" : {
					"username" : /^user8/
				},
				"fields" : {
					"username" : 1,
					"_id" : 1
				},
				"plan" : {
					"cursor" : "BtreeCursor username_1",
					"isMultiKey" : false,
					"scanAndOrder" : false,
					"indexBounds" : {
						"username" : [
							[
								"user8",
								"user9"
							],
							[
								/^user8/,
								/^user8/
							]
						]
					},
					"allPlans" : [
						{
							"cursor" : "BtreeCursor username_1",
							"isMultiKey" : false,
							"scanAndOrder" : false,
							"indexBounds" : {
								"username" : [
									[
										"user8",
										"user9"
									],
									[
										/^user8/,
										/^user8/
									]
								]
							}
						}
					]
				}
			}
		},
		{
			"$project" : {
				"username" : true
			}
		},
		{
			"$group" : {
				"_id" : {
					"$const" : 1
				},
				"count" : {
					"$sum" : {
						"$const" : 1
					}
				}
			}
		}
	]

Without the (needless) $project:

db.t1.aggregate([ {$match:{username:/^user8/}}, {$group:{_id:1,count:{$sum:1}}}],{explain:true})
{
	"stages" : [
		{
			"$cursor" : {
				"query" : {
					"username" : /^user8/
				},
				"fields" : {
					"_id" : 0,
					"$noFieldsNeeded" : 1
				},
				"plan" : {
					"cursor" : "BtreeCursor username_1",
					"isMultiKey" : false,
					"scanAndOrder" : false,
					"indexBounds" : {
						"username" : [
							[
								"user8",
								"user9"
							],
							[
								/^user8/,
								/^user8/
							]
						]
					},
					"allPlans" : [
						{
							"cursor" : "BtreeCursor username_1",
							"isMultiKey" : false,
							"scanAndOrder" : false,
							"indexBounds" : {
								"username" : [
									[
										"user8",
										"user9"
									],
									[
										/^user8/,
										/^user8/
									]
								]
							}
						}
					]
				}
			}
		},
		{
			"$group" : {
				"_id" : {
					"$const" : 1
				},
				"count" : {
					"$sum" : {
						"$const" : 1
					}
				}
			}
		}
	],
	"ok" : 1
}
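
The requested optimization can be illustrated with a small dependency check. The sketch below is not MongoDB's optimizer; the helper names (collectDeps, isInclusionOnly, isCovered, dropRedundantProjects) are hypothetical, and it handles only one narrow case: an inclusion-only $project immediately followed by a $group. It assumes such a $project can be dropped when every field path the $group reads is already kept by the $project, and it conservatively keeps the $project whenever the $group uses a "$$" variable.

```javascript
// Hypothetical sketch, NOT the server's implementation.

// Collect the field paths ("$a.b" style strings) that an expression reads.
// Any "$$" variable (e.g. "$$ROOT") is treated as needing the whole document.
function collectDeps(expr, deps = { fields: new Set(), wholeDoc: false }) {
  if (typeof expr === "string") {
    if (expr.startsWith("$$")) deps.wholeDoc = true;
    else if (expr.startsWith("$")) deps.fields.add(expr.slice(1));
  } else if (Array.isArray(expr)) {
    for (const e of expr) collectDeps(e, deps);
  } else if (expr !== null && typeof expr === "object") {
    for (const v of Object.values(expr)) collectDeps(v, deps);
  }
  return deps;
}

// True when the $project spec only includes fields (values 1 or true).
function isInclusionOnly(spec) {
  const values = Object.values(spec);
  return values.length > 0 && values.every(v => v === 1 || v === true);
}

// True when an inclusion-only spec keeps the given path.
// _id is kept by default in an inclusion projection.
function isCovered(path, spec) {
  if (path === "_id" || path.startsWith("_id.")) return true;
  return Object.keys(spec).some(k => path === k || path.startsWith(k + "."));
}

// Drop each inclusion-only $project whose following $group reads only
// fields that the $project would have kept anyway.
function dropRedundantProjects(pipeline) {
  return pipeline.filter((stage, i) => {
    const next = pipeline[i + 1];
    if (!stage.$project || !next || !next.$group) return true;
    if (!isInclusionOnly(stage.$project)) return true;
    const deps = collectDeps(next.$group);
    if (deps.wholeDoc) return true;
    const allCovered = [...deps.fields].every(p => isCovered(p, stage.$project));
    return !allCovered; // keep the stage only if removing it could change results
  });
}

const pipeline = [
  { $match: { username: /^user8/ } },
  { $project: { username: 1 } },
  { $group: { _id: 1, count: { $sum: 1 } } },
];
const optimized = dropRedundantProjects(pipeline);
console.log(optimized.map(stage => Object.keys(stage)[0]));
```

For the pipeline from this ticket, the $group reads no fields, so the sketch drops the $project and leaves only $match and $group, which matches the second pipeline above.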

is duplicated by

SERVER-14159 Aggregation framework performances drops significantly when projecting large sub documents

Closed

SERVER-49306 Optimization for mid-pipeline $project stages

Closed

SERVER-82836 UNPACK_TS_BUCKET stage includes fields it doesn't need

Closed

related to

SERVER-31082 when $count is at the end of multiple stages that don't change the number of documents in pipeline, those stages can be eliminated

Backlog

SERVER-55886 Optimize away unused computed fields

Backlog

SERVER-25120 aggregation requests generated field name from query

Closed


Assignee: [DO NOT USE] Backlog - Query Optimization
Reporter: Asya Kamsky
Participants: [DO NOT USE] Backlog - Query Optimization, Asya Kamsky, David Storch, Githook User, Mathias Stearn, Maxime Beaudry, Thomas Rueckstiess
Votes: 2
Watchers: 26

Created: Apr 23 2014 08:18:43 PM UTC
Updated: Aug 15 2025 01:33:46 PM UTC
