[SERVER-31282] $project of nested projection regression 3.2 -> 3.4 Created: 27/Sep/17  Updated: 04/Oct/17  Resolved: 04/Oct/17

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 3.4.9
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Eduardo Gurgel Pinho Assignee: Charlie Swanson
Resolution: Duplicate Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux and OS X


Issue Links:
Duplicate
duplicates DOCS-10869 Expand $project changes in 3.2->3.4 c... Closed
Operating System: ALL
Steps To Reproduce:

Given this aggregation:

db.inventory.drop()
db.inventory.insertOne( { item: 1, qty:{ sold: 1 } });
db.inventory.insertOne( { item: 3, qty:{ etc: 3 } });
db.inventory.insertOne( { item: 4, });
 
var pipeline = [{ $project : {"_id" : 0, "0" : { "sold" : "$qty.sold"} } }];
var result = db.inventory.aggregate(pipeline);
 
result.forEach(function(r) { printjson(r) } )

If you run the above aggregation against 3.2.* this is the result:

{ "0" : { "sold" : 1 } }
{ }
{ }

If you run the above aggregation against 3.4.* this is the result:

{ "0" : { "sold" : 1 } }
{ "0" : { } }
{ "0" : { } }
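The two result sets differ only in the empty "0" sub-document. For code that depends on the 3.2 shape, a minimal client-side sketch is to drop empty sub-documents after the aggregation returns. The helper name `stripEmptySubdocs` is hypothetical (it is not part of MongoDB or any driver), and it assumes that an empty plain object in the result never carries meaning for the caller.

```javascript
// Hypothetical helper, not a MongoDB API: returns a copy of `doc` in which
// every field whose value is an empty plain object has been removed.
// Arrays, dates, ObjectIds and other non-plain objects are left untouched.
function isPlainObject(value) {
  return value !== null &&
    typeof value === "object" &&
    Object.getPrototypeOf(value) === Object.prototype;
}

function stripEmptySubdocs(doc) {
  var out = {};
  Object.keys(doc).forEach(function (key) {
    var value = doc[key];
    if (isPlainObject(value)) {
      value = stripEmptySubdocs(value);
      if (Object.keys(value).length === 0) {
        return; // drop the empty sub-document
      }
    }
    out[key] = value;
  });
  return out;
}

// In the mongo shell this could be applied as:
//   db.inventory.aggregate(pipeline).toArray().map(stripEmptySubdocs)
```

Note that the sketch also removes sub-documents that become empty only after their own empty children are dropped, and it does not look inside arrays.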

Sprint: Query 2017-10-23
Participants:

 Description   

The result of a projection using nested keys is different from 3.2 to 3.4.

We had to use `$ifNull` in some cases to avoid this difference in results.
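A sketch of what such a workaround might look like for the pipeline above, assuming that an explicit null is an acceptable stand-in for a missing `qty.sold`: wrapping the field path in `$ifNull` means the nested key always evaluates to a value, so the result should no longer depend on how each version treats missing fields.

```javascript
// Sketch only: the null fallback is an assumption, pick whatever default
// suits the application.
var pipeline = [
  { $project: { _id: 0, "0": { sold: { $ifNull: ["$qty.sold", null] } } } }
];

// In the mongo shell:
//   db.inventory.aggregate(pipeline).forEach(function (r) { printjson(r); });
```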

I couldn't find this listed among the breaking changes for 3.4, and I couldn't find an existing issue describing exactly this behavior.

We found this after changing our CI to use 3.4.

I'm sorry if I missed some documentation explaining this breaking change. Thanks for your attention!



 Comments   
Comment by Eduardo Gurgel Pinho [ 04/Oct/17 ]

Thank you very much for the extremely detailed explanation!

Comment by Charlie Swanson [ 04/Oct/17 ]

Hi edgurgel,

Thanks for your patience. This was indeed an intentional change, and we do believe the new semantics are more consistent.

In more detail:

The First Compatibility Change

{$project: {a: {b: "$missing"}}} used to be {}, now is {a: {}}

This change makes the $project stage more consistent with how we treat objects elsewhere in the aggregation language. Consider the following example, which works on both 3.2.0 and 3.4.0:

> db.foo.insert([{_id: 0, x: 1, y: 1}, {_id: 1, x: 2, y: 2}])
BulkWriteResult({
	"writeErrors" : [ ],
	"writeConcernErrors" : [ ],
	"nInserted" : 2,
	"nUpserted" : 0,
	"nMatched" : 0,
	"nModified" : 0,
	"nRemoved" : 0,
	"upserted" : [ ]
})
// The argument to the $push accumulator below is an object expression, similar to how
// 3.4 now interprets the 'a' value in {$project: {a: {b: "$b"}}}.
> db.foo.aggregate([{$group: {_id: "$x", other: {$push: {other: "$other"}}}}])
{ "_id" : 2, "other" : [ {  } ] }
{ "_id" : 1, "other" : [ {  } ] }
// The array arguments to $concatArrays below construct object literals and also result in
// an empty object when all fields are missing.
> db.foo.aggregate([{$project: {_id: 0, new: {$concatArrays: [[{object: "$literal"}], [{object: "$literal"}]]}}}])
{ "new" : [ {  }, {  } ] }
{ "new" : [ {  }, {  } ] }

This is also consistent with how the (new in 3.4) $replaceRoot stage handles object literals:

> db.foo.aggregate([{$replaceRoot: {newRoot: {x: "$missing"}}}])
{  }
{  }

The Second Compatibility Change

{'a.b': <expression>} used to be {'a.b': <result of expression>}, now is an error (field names with dots aren't really supported elsewhere)

I actually mis-remembered here. This wasn't a change in $project's behavior; it's a change in $group's behavior. It's very surprising, but due to an oddity in the old implementation of $project, the following happens:

> db.foo.insert([{_id: 0, x: 1, y: 1}, {_id: 1, x: 2, y: 2}])
BulkWriteResult({
	"writeErrors" : [ ],
	"writeConcernErrors" : [ ],
	"nInserted" : 2,
	"nUpserted" : 0,
	"nMatched" : 0,
	"nModified" : 0,
	"nRemoved" : 0,
	"upserted" : [ ]
})
// The 'x.y' should not be allowed as a field name below:
> db.foo.aggregate([{$group: {_id: {"x.y": "$x"}, other: {$push: {other: "$other"}}}}])
{ "_id" : { "x.y" : 2 }, "other" : [ {  } ] }
{ "_id" : { "x.y" : 1 }, "other" : [ {  } ] }

This was previously enforced in projection, and elsewhere when object literals were used:

// "other.z" is illegal below:
> db.foo.aggregate([{$group: {_id: "$x", other: {$push: {"other.z": "$other"}}}}])
assert: command failed: {
	"ok" : 0,
	"errmsg" : "dotted field names are only allowed at the top level",
	"code" : 16405
} : aggregate failed
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:13:14
assert.commandWorked@src/mongo/shell/assert.js:244:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1297:5
@(shell):1:1
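To catch this kind of specification before it reaches the server, a rough pre-flight check can walk the spec and report dotted field names below the top level. The function name `findNestedDottedKeys` is hypothetical, and the rule it applies (any key containing a dot at depth greater than zero) is only an approximation of the server's validation, based on the error message above.

```javascript
// Hypothetical helper, not a MongoDB API: returns the paths of dotted field
// names that appear below the top level of a stage specification.
function findNestedDottedKeys(spec) {
  var found = [];
  function visit(value, path) {
    if (Array.isArray(value)) {
      value.forEach(function (item, i) {
        visit(item, path.concat(String(i)));
      });
    } else if (value !== null && typeof value === "object") {
      Object.keys(value).forEach(function (key) {
        var here = path.concat(key);
        // Only keys below the top level are reported.
        if (path.length > 0 && key.indexOf(".") !== -1) {
          found.push(here.join(" -> "));
        }
        visit(value[key], here);
      });
    }
  }
  visit(spec, []);
  return found;
}

var groupSpecA = { _id: { "x.y": "$x" }, other: { $push: { other: "$other" } } };
var groupSpecB = { _id: "$x", other: { $push: { "other.z": "$other" } } };
var flaggedA = findNestedDottedKeys(groupSpecA); // ["_id -> x.y"]
var flaggedB = findNestedDottedKeys(groupSpecB); // ["other -> $push -> other.z"]
```

Field paths used as values, such as "$qty.sold", are not flagged, because only keys are inspected.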

The Third Compatibility Change

_id will always show up in explain output for inclusion projections

If I remember correctly, this change was made because:

  • It was easier to implement that way: we didn't have to remember whether the '_id' specification was filled in by the user or defaulted by the parsing code.
  • It is actually useful in explain output, to see that we inject an '_id: 1' if you didn't mention '_id' in your $project stage.
  • It is unlikely to break applications, as we think it is unlikely an application is depending on the output format of explain.

The Last Compatibility Change:

{a: {}, 'b.c': {}} is now an error, used to be {}

After introducing exclusion projections, this is ambiguous as to whether it means "include everything within 'a' and 'b.c'" or "exclude everything within 'a' and 'b.c'". Similarly,

{$project: { }}

is ambiguous: is it 'include everything' or 'exclude everything'? It used to be 'include everything', which is surprising.
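Stating the intent explicitly avoids the ambiguity. As a sketch, using the field names 'a' and 'b.c' from the example above, the two readings can be written as an inclusion projection and as a 3.4 exclusion projection:

```javascript
// Keep only 'a' and 'b.c' (plus _id, which inclusion projections keep by default).
var includeOnly = [{ $project: { a: 1, "b.c": 1 } }];

// Drop 'a' and 'b.c' and keep everything else (exclusion projections require 3.4).
var excludeOnly = [{ $project: { a: 0, "b.c": 0 } }];

// In the mongo shell: db.foo.aggregate(includeOnly) or db.foo.aggregate(excludeOnly)
```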

We already documented the last compatibility change, but I've filed DOCS-10869 to include a note about the others. I'll now resolve this ticket as a duplicate of the new DOCS ticket; please watch DOCS-10869 for further updates.

I apologize for the trouble this caused you!

Best,
Charlie

Comment by Eduardo Gurgel Pinho [ 27/Sep/17 ]

Thank you for the very detailed explanation!
We were unsure if it was intended or not so that's why we decided to open the issue.

Comment by Charlie Swanson [ 27/Sep/17 ]

Hi edgurgel,

This was a change introduced during the re-write of the $project stage during 3.4. In order to allow exclusion projections we essentially re-implemented that stage. I have a note written down detailing this change, but it looks like we forgot to put it in the release notes. In any case, it'll take some amount of work to decide what is 'correct' here, and make sure it's consistent with other places in the server. I'm putting this ticket into the 'Needs Triage' queue for the query team to look at and prioritize.

The note I have about the re-write reads:

'Breaking' changes:
{a: {b: "$missing"}} used to be {}, now is {a: {}}
{'a.b': <expression>} used to be {'a.b': <result of expression>}, now is an error (field names with dots aren't really supported elsewhere)
_id will always show up in explain output for inclusions
{a: {}, 'b.c': {}} is now an error, used to be {}

I think the next steps here are:

  1. Confirm these behavior changes are for the better.
  2. If so, file a DOCS ticket to document the breaking change(s).
  3. If not, investigate how hard it would be to fix them, and re-triage.