[SERVER-8141] Avoid treating arrays as literals in the aggregation pipeline
Created: 10/Jan/13  Updated: 28/Sep/16  Resolved: 15/Jul/15

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: 3.1.6

Type: New Feature
Priority: Major - P3
Reporter: Mathias Stearn
Assignee: Charlie Swanson
Resolution: Done
Votes: 14
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-9023 Avoid treating arrays as literals in ... Closed
Related
related to CSHARP-1365 Avoid treating arrays as literals in ... Closed
related to DRIVERS-234 Aggregation Builder Support for 3.2 Closed
Backwards Compatibility: Major Change
Sprint: Quint Iteration 4, Quint Iteration 5, Quint Iteration 6

 Description   

The aggregation framework interprets arrays as literals, which is unintuitive. For example:

> db.foo.drop()
true
> db.foo.insert({_id: 0, a: ['foo', 'bar', 'baz'], b: 'bar', c: 'Baz'})
WriteResult({ "nInserted" : 1 })
> db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}])
{ "_id" : 0, "intersection" : [ ] }  // Intuitively would expect [ 'bar', 'baz' ]
> db.foo.insert({_id: 1, a: ['foo', '$b'], b: 'bar', c: 'Baz'})
WriteResult({ "nInserted" : 1 })
> db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}])
{ "_id" : 0, "intersection" : [ ] }
{ "_id" : 1, "intersection" : [ "$b" ] }  // Instead, it is matching against literals '$b' and {$toLower: '$c'}

Rather than evaluating '$b' to the value of the field 'b' in the current document, as is done with '$a', the parser treats '$b' as a literal string because it appears inside a nested array. Treating arrays as literals also prevents a $project stage from creating arrays of fields:

> db.bar.drop()
true
> db.bar.insert({_id: 0, point: {x: 10, y: 20}})
WriteResult({ "nInserted" : 1 })
> db.bar.aggregate([{$project: {coords: ['$point.x', '$point.y']}}])
assert: command failed: {
	"errmsg" : "exception: disallowed field type Array in object expression (at 'coords')",
	"code" : 15992,
	"ok" : 0
} : aggregate failed

The aggregation framework should parse arrays the same way it parses expressions elsewhere. The examples above would then yield the following results:

// With documents from collection 'foo' from above.
> db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}])
{ "_id" : 0, "intersection" : [ "baz", "bar" ] }
{ "_id" : 1, "intersection" : [ ] }

// With document from collection 'bar' from above.
> db.bar.aggregate([{$project: {coords: ['$point.x', '$point.y']}}])
{ "_id" : 0, "coords" : [ 10, 20 ] }
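
The difference between the two parsing behaviors can be sketched with a toy evaluator in plain JavaScript. This is only an illustration of the idea, not the server's parser: the names getPath, evaluate, and the arraysAsLiterals flag are hypothetical, and only field paths, arrays, and $toLower are modeled.

```javascript
// Toy model only. getPath, evaluate, and arraysAsLiterals are hypothetical
// names invented for this sketch; they are not part of MongoDB.
function getPath(doc, path) {
  // Walk a dotted path such as 'point.x' through nested objects.
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]), doc);
}

function evaluate(expr, doc, arraysAsLiterals) {
  if (typeof expr === 'string' && expr.startsWith('$')) {
    return getPath(doc, expr.slice(1)); // field path, e.g. '$b'
  }
  if (Array.isArray(expr)) {
    // Old behavior: return the array untouched.
    // Proposed behavior: evaluate every element as an expression.
    return arraysAsLiterals
      ? expr
      : expr.map(element => evaluate(element, doc, arraysAsLiterals));
  }
  if (expr !== null && typeof expr === 'object') {
    const op = Object.keys(expr)[0];
    if (op === '$toLower') {
      return String(evaluate(expr[op], doc, arraysAsLiterals)).toLowerCase();
    }
    throw new Error('operator not modeled in this sketch: ' + op);
  }
  return expr; // numbers, plain strings, null
}

const fooDoc = {_id: 0, a: ['foo', 'bar', 'baz'], b: 'bar', c: 'Baz'};
const arrayExpr = ['$b', {$toLower: '$c'}];

const asLiteral = evaluate(arrayExpr, fooDoc, true);
const asExpression = evaluate(arrayExpr, fooDoc, false);
console.log(JSON.stringify(asLiteral));    // ["$b",{"$toLower":"$c"}]
console.log(JSON.stringify(asExpression)); // ["bar","baz"]

const barDoc = {_id: 0, point: {x: 10, y: 20}};
const coords = evaluate(['$point.x', '$point.y'], barDoc, false);
console.log(JSON.stringify(coords));       // [10,20]
```

With arraysAsLiterals set to true the nested array comes back unchanged, which is why the intersection with 'a' is empty for document 0 in the transcripts above.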

Note that this is a backward-incompatible change.
=======================================
Original description:
Builds an array.

input:
 {a:1, b:2, c:3}
 
operation:
 {$project: {array: {$array: ['$a', '$c', {$add:['$b', '$c'] }] } } }
 
output:
 {array: [1, 3, 5]}

One open issue is how to handle missing fields (e.g. {$push:['$d']}). When building an object, we would omit that field from the output. Omission would be expected if the array is a list or set, but could be a problem if indexed access is important. Possible solutions:

  1. Omit the field, shrinking the array.
  2. Replace the field with null, keeping the array the same size, but creating a value and differing from object behavior.
  3. Error out.
  4. Provide separate operators, such as $array for when indexes are important and $list for when they aren't.
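
The first three options can be compared with a small plain-JavaScript sketch. The helper buildArray and its policy strings ('omit', 'null', 'error') are hypothetical names made up for illustration; nothing here reflects which option the server adopted.

```javascript
// Hypothetical helper: buildArray and its policy names are invented for
// this sketch and do not correspond to any MongoDB API.
function buildArray(fieldNames, doc, policy) {
  const out = [];
  for (const name of fieldNames) {
    if (Object.prototype.hasOwnProperty.call(doc, name)) {
      out.push(doc[name]);
    } else if (policy === 'omit') {
      continue;                 // option 1: skip it, so the array shrinks
    } else if (policy === 'null') {
      out.push(null);           // option 2: keep the slot, fill with null
    } else {
      throw new Error('missing field: ' + name); // option 3: error out
    }
  }
  return out;
}

const input = {a: 1, b: 2, c: 3};
const fields = ['a', 'd', 'c']; // 'd' does not exist in the input

const omitted = buildArray(fields, input, 'omit');
const nulled = buildArray(fields, input, 'null');
console.log(JSON.stringify(omitted)); // [1,3]
console.log(JSON.stringify(nulled));  // [1,null,3]

let errorMessage = null;
try {
  buildArray(fields, input, 'error');
} catch (err) {
  errorMessage = err.message;
}
console.log(errorMessage);            // missing field: d
```

Under the 'omit' policy the value of 'c' moves from index 2 to index 1, which is the indexed-access problem described above; the 'null' policy keeps positions stable at the cost of introducing a value that was never in the document.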


 Comments   
Comment by Charlie Swanson [ 15/Jul/15 ]

I have updated the description to reflect how we addressed this problem. Instead of introducing a new $array operator, we changed the parsing of arrays to avoid treating all elements as constants.

Comment by Githook User [ 15/Jul/15 ]

Author: Charlie Swanson (cswanson310) <charlie.swanson@mongodb.com>

Message: SERVER-8141 Avoid treating arrays as literals in aggregation pipeline
Branch: master
https://github.com/mongodb/mongo/commit/c6e7a0874e5fc2767a5d50f47d0441703fea73ac

Comment by Kevin Davenport [X] [ 08/Jan/14 ]

This would be most excellent.
