[SERVER-75267] Summing $bsonSize in $group stage with projected fields fails due to an invalid pushdown of $project Created: 24/Mar/23  Updated: 22/Aug/23  Resolved: 22/Aug/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 5.2.0, 6.0.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Stuart Wheaton Assignee: Peter Volk
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-77183 $project followed by $group gives inc... Closed
Assigned Teams:
Query Optimization
Operating System: ALL
Steps To Reproduce:

Use any collection with at least 1 non-ID field. For example, in mongosh

 

db.test.insertOne({foo: "bar"}) 

First see that the object size with only _id projected field is 22:

 

 

test> pipeline = [
  { '$project': { _id: 1 } },
  { '$project': { size_bytes: { '$bsonSize': '$$ROOT' } } }
]
test> db.test.aggregate(pipeline)
[ { _id: ObjectId("641dc5f51b47ae483a7f5235"), size_bytes: 22 } ]

 

But the whole document including field "foo" is 35:

test> pipeline = [
  { '$project': { size_bytes: { '$bsonSize': '$$ROOT' } } }
]
test> db.test.aggregate(pipeline)
[ { _id: ObjectId("641dc5f51b47ae483a7f5235"), size_bytes: 35 } ] 

Now sum up the size of all documents (there's only 1) which is 35 but should be 22:

test> db.version()
6.0.5
test> pipeline = [
  { '$project': { _id: 1 } },
  {
    '$group': { _id: null, size_bytes: { '$sum': { '$bsonSize': '$$ROOT' } } }
  }
]
test> db.test.aggregate(pipeline)
[ { _id: null, size_bytes: 35 } ]

In version 5.0 it reports size 22:

test> db.version()
5.0.15
test> db.test.aggregate(pipeline)
[ { _id: null, size_bytes: 22 } ] 

Using $unset stage to exclude a field does work as expected:

test> pipeline = [
  { '$unset': 'foo' },
  {
    '$group': { _id: null, size_bytes: { '$sum': { '$bsonSize': '$$ROOT' } } }
  }
]
test> db.test.aggregate(pipeline)
[ { _id: null, size_bytes: 22 } ]

Potential workaround of 2 stages to do the sum rather than 1:

[
  { '$project': { _id: 1 } },
  { '$project': { size: { '$bsonSize': '$$ROOT' } } },
  { '$group': { _id: null, size_bytes: { '$sum': '$size' } } }
] 

Participants:

 Description   

Grouping sum of $bsonSize gets the sum of full documents not just the projected fields.

In MongoDB version 5.0, using the sum of document sizes with a group stage as recommended on this page works when fields are projected (see steps below).

However in version 6.0, it merely returns the sum of the whole document size, disregarding any field projections. I believe this may have to do with the change in release 5.2 to using Slot Based Execution query engine to execute $group stages. In 5.0 the regular query execution engine is used vs. 6.0 the SBE is used.

What is interesting is that if you use a $unset stage to exclude fields then the size is calculated correctly. So a potential workaround is to exclude fields rather than include fields.

Another potential workaround is to use 2 stages (project document size then sum that size) rather than the 1 group stage - example shown below.



 Comments   
Comment by Peter Volk [ 22/Aug/23 ]

Hi stuart@voxel51.com,

It turns out that this is an issue that was solved with SERVER-77183. The bugfix was backported to 7.0 and the 6.0 branch and hence will be included in the next respective releases (7.1, 7.0.1, 6.0.10) I'll be closing this ticket as duplicate.

Thanks for the report and pointing this out.

Best,
Peter

Comment by Peter Volk [ 21/Jul/23 ]

Hi stuart@voxel51.com,

Thanks for the report. This is being considered a bug. The QO team will look into this to check what the options are and provide a fix. I'll keep you posted on the progress. 

Thanks,
Peter

Comment by Stuart Wheaton [ 18/Jul/23 ]

Thanks for looking into it mihai.andrei@mongodb.com !

I think it's making sense to me - the issue at least. But I'm not clear on what you're saying here. Is it expected behavior? Or you're saying an inapplicable optimization is being applied that shouldn't be?

The results of $project stage does not seem to always be discarded by $group stage, so it must be this specific case where the optimization is incorrectly applied.

If this (described in ticket) is expected behavior, I'm happy to submit a documentation PR to better clarify this specific use case.

Comment by Mihai Andrei [ 18/Jul/23 ]

Hi stuart@voxel51.com,

Thanks for the report! After taking a look at this, this doesn't appear to be a bug with SBE. Rather, this is the result of the query planner generating different plans across versions. In one case, our query plan looks like:

[js_test:repro] [jsTest] 	"stages" : [
[js_test:repro] [jsTest] 		{
[js_test:repro] [jsTest] 			"$cursor" : {
[js_test:repro] [jsTest] 				"queryPlanner" : {
[js_test:repro] [jsTest] 					"namespace" : "test.repro",
[js_test:repro] [jsTest] 					"indexFilterSet" : false,
[js_test:repro] [jsTest] 					"parsedQuery" : {
[js_test:repro] [jsTest]
[js_test:repro] [jsTest] 					},
[js_test:repro] [jsTest] 					"queryHash" : "CC21B94D",
[js_test:repro] [jsTest] 					"planCacheKey" : "CC21B94D",
[js_test:repro] [jsTest] 					"maxIndexedOrSolutionsReached" : false,
[js_test:repro] [jsTest] 					"maxIndexedAndSolutionsReached" : false,
[js_test:repro] [jsTest] 					"maxScansToExplodeReached" : false,
[js_test:repro] [jsTest] 					"winningPlan" : {
[js_test:repro] [jsTest] 						"stage" : "PROJECTION_SIMPLE",
[js_test:repro] [jsTest] 						"transformBy" : {
[js_test:repro] [jsTest] 							"_id" : true
[js_test:repro] [jsTest] 						},
[js_test:repro] [jsTest] 						"inputStage" : {
[js_test:repro] [jsTest] 							"stage" : "COLLSCAN",
[js_test:repro] [jsTest] 							"direction" : "forward"
[js_test:repro] [jsTest] 						}
[js_test:repro] [jsTest] 					},
[js_test:repro] [jsTest] 					"rejectedPlans" : [ ]
[js_test:repro] [jsTest] 				}
[js_test:repro] [jsTest] 			}
[js_test:repro] [jsTest] 		},
[js_test:repro] [jsTest] 		{
[js_test:repro] [jsTest] 			"$group" : {
[js_test:repro] [jsTest] 				"_id" : "$_id",
[js_test:repro] [jsTest] 				"size_bytes" : {
[js_test:repro] [jsTest] 					"$sum" : {
[js_test:repro] [jsTest] 						"$bsonSize" : [
[js_test:repro] [jsTest] 							"$$ROOT"
[js_test:repro] [jsTest] 						]
[js_test:repro] [jsTest] 					}
[js_test:repro] [jsTest] 				}
[js_test:repro] [jsTest] 			}
[js_test:repro] [jsTest] 		}
[js_test:repro] [jsTest] 	],

In which case, the documents coming into $group (and the $bsonSize expression) only contain the _id field (and the objects are all of size 22). But then, with SBE enabled, we have a different plan:

[js_test:repro] [jsTest] 	"queryPlanner" : {
[js_test:repro] [jsTest] 		"namespace" : "test.repro",
[js_test:repro] [jsTest] 		"indexFilterSet" : false,
[js_test:repro] [jsTest] 		"parsedQuery" : {
[js_test:repro] [jsTest]
[js_test:repro] [jsTest] 		},
[js_test:repro] [jsTest] 		"queryHash" : "EF4BF651",
[js_test:repro] [jsTest] 		"planCacheKey" : "EF4BF651",
[js_test:repro] [jsTest] 		"optimizedPipeline" : true,
[js_test:repro] [jsTest] 		"maxIndexedOrSolutionsReached" : false,
[js_test:repro] [jsTest] 		"maxIndexedAndSolutionsReached" : false,
[js_test:repro] [jsTest] 		"maxScansToExplodeReached" : false,
[js_test:repro] [jsTest] 		"winningPlan" : {
[js_test:repro] [jsTest] 			"queryPlan" : {
[js_test:repro] [jsTest] 				"stage" : "GROUP",
[js_test:repro] [jsTest] 				"planNodeId" : 2,
[js_test:repro] [jsTest] 				"inputStage" : {
[js_test:repro] [jsTest] 					"stage" : "COLLSCAN",
[js_test:repro] [jsTest] 					"planNodeId" : 1,
[js_test:repro] [jsTest] 					"filter" : {
[js_test:repro] [jsTest]
[js_test:repro] [jsTest] 					},
[js_test:repro] [jsTest] 					"direction" : "forward"
[js_test:repro] [jsTest] 				}
[js_test:repro] [jsTest] 			},
[js_test:repro] [jsTest] 			"slotBasedPlan" : {
[js_test:repro] [jsTest] 				"slots" : "$$RESULT=s11 env: { s1 = TimeZoneDatabase(Africa/Johannesburg...America/North_Dakota/New_Salem) (timeZoneDB), s2 = Nothing (SEARCH_META), s3 = 1682358420270 (NOW) }",
[js_test:repro] [jsTest] 				"stages" : "[2] mkbson s11 [_id = s7, size_bytes = s10] true false \n[2] project [s10 = doubleDoubleSumFinalize (s8)] \n[2] group [s7] [s8 = aggDoubleDoubleSum (let [l1.0 = s4] if (! exists (l1.0) || typeMatch (l1.0, 1088), null, if (! isObject (l1.0), fail ( 5043001 ,$bsonSize requires a document input), bsonSize (l1.0))))] spillSlots[s9] mergingExprs[aggMergeDoubleDoubleSums (s9)] \n[2] project [s7 = fillEmpty (s6, null)] \n[2] project [s6 = getField (s4, \"_id\")] \n[1] scan s4 s5 none none none none [] @\"3778419a-57db-4be6-8660-2468fe6dfc9f\" true false "
[js_test:repro] [jsTest] 			}
[js_test:repro] [jsTest] 		},
[js_test:repro] [jsTest] 		"rejectedPlans" : [ ]
[js_test:repro] [jsTest] 	},
[js_test:repro] [jsTest] 	"command" : {
[js_test:repro] [jsTest] 		"aggregate" : "repro",
[js_test:repro] [jsTest] 		"pipeline" : [
[js_test:repro] [jsTest] 			{
[js_test:repro] [jsTest] 				"$project" : {
[js_test:repro] [jsTest] 					"_id" : 1
[js_test:repro] [jsTest] 				}
[js_test:repro] [jsTest] 			},
[js_test:repro] [jsTest] 			{
[js_test:repro] [jsTest] 				"$group" : {
[js_test:repro] [jsTest] 					"_id" : "$_id",
[js_test:repro] [jsTest] 					"size_bytes" : {
[js_test:repro] [jsTest] 						"$sum" : {
[js_test:repro] [jsTest] 							"$bsonSize" : "$$ROOT"
[js_test:repro] [jsTest] 						}
[js_test:repro] [jsTest] 					}
[js_test:repro] [jsTest] 				}
[js_test:repro] [jsTest] 			}
[js_test:repro] [jsTest] 		],

Note how here we are $grouping over the documents from the collection (no projection applied). The reason for this is that the query optimizer decided to eliminate the $project before the $group as it is redundant (the documents produced by any project will be immediately discarded). Arguably, the use of $bsonSize over $$ROOT should inhibit this optimization, as the result of the expression directly depends on the documents produced by a previous stage. Regardless, this is not a bug with SBE as this is the result of a change in the behavior of the optimizer.

Comment by Chris Kelly [ 24/Mar/23 ]

Hi stuart@voxel51.com,

Thanks for your report! I'll pass this to the relevant team to take a look.

Generated at Thu Feb 08 06:29:44 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.