[DOCS-9023] Avoid treating arrays as literals in the aggregation pipeline Created: 28/Sep/16  Updated: 30/Oct/23

Status: Closed
Project: Documentation
Component/s: Server
Affects Version/s: None
Fix Version/s: Server_Docs_20231030

Type: Task
Priority: Major - P3
Reporter: Emily Hall
Assignee: Kay Kim (Inactive)
Resolution: Won't Do
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
documents SERVER-8141 Avoid treating arrays as literals in ... Closed
Participants:
Days since reply: 1 year, 14 weeks, 2 days ago
Epic Link: 3.4 Agg/Arrays

 Description   

In the aggregation framework, arrays are interpreted as literals rather than as lists of expressions, which is not the intuitive behavior. Take the following shell session as an example:

> db.foo.drop()
true
> db.foo.insert({_id: 0, a: ['foo', 'bar', 'baz'], b: 'bar', c: 'Baz'})
WriteResult({ "nInserted" : 1 })
> db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}])
{ "_id" : 0, "intersection" : [ ] }  // Intuitively would expect [ 'bar', 'baz' ]
> db.foo.insert({_id: 1, a: ['foo', '$b'], b: 'bar', c: 'Baz'})
WriteResult({ "nInserted" : 1 })
> db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}])
{ "_id" : 0, "intersection" : [ ] }
{ "_id" : 1, "intersection" : [ "$b" ] }  // Instead, it is matching against literals '$b' and {$toLower: '$c'}

Instead of evaluating '$b' to the value of the field 'b' in the current document, as is done with '$a', '$b' is treated as a literal when it is parsed inside an array. Treating arrays as literals also prevents a $project stage from building an array out of field values:

> db.bar.drop()
true
> db.bar.insert({_id: 0, point: {x: 10, y: 20}})
WriteResult({ "nInserted" : 1 })
> db.bar.aggregate([{$project: {coords: ['$point.x', '$point.y']}}])
assert: command failed: {
	"errmsg" : "exception: disallowed field type Array in object expression (at 'coords')",
	"code" : 15992,
	"ok" : 0
} : aggregate failed

The aggregation framework should parse the elements of an array the same way it parses expressions elsewhere. The examples above would then yield the following results:

// With documents from collection 'foo' from above.
> db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}])
{ "_id" : 0, "intersection" : [ "baz", "bar" ] }
{ "_id" : 1, "intersection" : [ ] }

// With document from collection 'bar' from above.
> db.bar.aggregate([{$project: {coords: ['$point.x', '$point.y']}}])
{ "_id" : 0, "coords" : [ 10, 20 ] }

Note that this is a backwards-incompatible (breaking) change.
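
To make the difference between the two parsing behaviors concrete, here is a minimal sketch in plain JavaScript. It is a toy model, not the server's implementation: the helper names getPath and evaluate and the arraysAsLiterals flag are made up for illustration, and only the operators used in this ticket are handled. It also includes $literal, the existing operator that a pipeline could use to keep an array literal once arrays are evaluated.

```javascript
// Toy model of expression evaluation. NOT MongoDB's implementation.
// getPath, evaluate and arraysAsLiterals are hypothetical names.

// Resolve a dotted path such as "point.x"; returns undefined when missing.
function getPath(doc, path) {
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]), doc);
}

function evaluate(expr, doc, arraysAsLiterals) {
  if (typeof expr === 'string' && expr.startsWith('$')) {
    return getPath(doc, expr.slice(1)); // field path
  }
  if (Array.isArray(expr)) {
    // Behavior described in this ticket: return the array untouched.
    // Proposed behavior: evaluate every element as an expression.
    return arraysAsLiterals
      ? expr
      : expr.map(e => evaluate(e, doc, arraysAsLiterals));
  }
  if (expr !== null && typeof expr === 'object') {
    const [op] = Object.keys(expr);
    const arg = expr[op];
    switch (op) {
      case '$literal':
        return arg; // never evaluated
      case '$toLower':
        return String(evaluate(arg, doc, arraysAsLiterals)).toLowerCase();
      case '$setIntersection': {
        // The operand list itself is always evaluated; only arrays
        // nested inside it are affected by the flag.
        const [first, ...rest] =
          arg.map(e => evaluate(e, doc, arraysAsLiterals));
        const others = rest.map(s => new Set(s.map(v => JSON.stringify(v))));
        const seen = new Set();
        return first.filter(v => {
          const key = JSON.stringify(v);
          if (seen.has(key)) return false;
          seen.add(key);
          return others.every(set => set.has(key));
        });
      }
      default:
        throw new Error(`operator ${op} is not part of this sketch`);
    }
  }
  return expr; // numbers, plain strings, null
}

const expr = {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]};
const doc0 = {_id: 0, a: ['foo', 'bar', 'baz'], b: 'bar', c: 'Baz'};
const doc1 = {_id: 1, a: ['foo', '$b'], b: 'bar', c: 'Baz'};

console.log(evaluate(expr, doc0, true));  // [] (arrays as literals)
console.log(evaluate(expr, doc1, true));  // [ '$b' ]
console.log(evaluate(expr, doc0, false)); // [ 'bar', 'baz' ] (proposed)
console.log(evaluate(expr, doc1, false)); // []
```

With the flag off, evaluate(['$point.x', '$point.y'], {point: {x: 10, y: 20}}, false) returns [10, 20]. The sketch does not reproduce the server's error 15992 for that case when the flag is on.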
=======================================
Original description:
Builds an array.

input:
 {a:1, b:2, c:3}
 
operation:
 {$project: {array: {$array: ['$a', '$c', {$add:['$b', '$c'] }] } } }
 
output:
 {array: [1, 3, 5]}

One open issue is how to handle missing fields (e.g. {$push:['$d']}). When building an object we would omit that field from the output. Omitting it would also be expected if the array is used as a list or set, but could be a problem if indexed access is important. Possible solutions:

  1. Omit the field, shrinking the array.
  2. Replace the field with null, keeping the array the same size, but creating a value and differing from object behavior.
  3. Error out.
  4. Separate operators like $array when indexes are important and $list for when they aren't.
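
The first three options can be compared with a small sketch in plain JavaScript. The function buildArray and its onMissing policy names are hypothetical and only illustrate the trade-off; they are not an existing or proposed server API. Option 4 would amount to exposing two of these policies under different operator names.

```javascript
// Hypothetical helper: build an array from top-level fields of doc,
// handling missing fields according to the chosen policy.
function buildArray(doc, fields, onMissing) {
  const out = [];
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(doc, field)) {
      out.push(doc[field]);
      continue;
    }
    switch (onMissing) {
      case 'omit':  // option 1: the array shrinks
        break;
      case 'null':  // option 2: positions are preserved
        out.push(null);
        break;
      case 'error': // option 3: refuse to build the array
        throw new Error(`missing field '${field}'`);
      default:
        throw new Error(`unknown policy '${onMissing}'`);
    }
  }
  return out;
}

const input = {a: 1, b: 2, c: 3};

console.log(buildArray(input, ['a', 'd', 'c'], 'omit')); // [ 1, 3 ]
console.log(buildArray(input, ['a', 'd', 'c'], 'null')); // [ 1, null, 3 ]
```

With 'omit', the value of 'c' moves from index 2 to index 1, which is the indexed-access problem described above. With 'null', indexes stay stable but the output contains a value that the input document never had.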


 Comments   
Comment by Education Bot [ 31/Oct/22 ]

Hello! This ticket has been closed due to inactivity. If you believe this ticket is still important, please reopen it and leave a comment to explain why. Thank you!

Generated at Thu Feb 08 07:57:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.