Details
- Task
- Resolution: Won't Do
- Major - P3
Description
In the aggregation framework, arrays are interpreted as literals, which is not the intuitive behavior. Take the following code as an example:
> db.foo.drop()
true
> db.foo.insert({_id: 0, a: ['foo', 'bar', 'baz'], b: 'bar', c: 'Baz'})
WriteResult({ "nInserted" : 1 })
> db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}])
{ "_id" : 0, "intersection" : [ ] } // Intuitively would expect [ 'bar', 'baz' ]
> db.foo.insert({_id: 1, a: ['foo', '$b'], b: 'bar', c: 'Baz'})
WriteResult({ "nInserted" : 1 })
> db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}])
{ "_id" : 0, "intersection" : [ ] }
{ "_id" : 1, "intersection" : [ "$b" ] } // Instead, it is matching against literals '$b' and {$toLower: '$c'}
Instead of evaluating '$b' to the value of the field 'b' in the current document, as is done with '$a', '$b' is treated as a literal when it is parsed inside an array. Because arrays are treated as literals, a projection stage also cannot create arrays of fields:
> db.bar.drop()
true
> db.bar.insert({_id: 0, point: {x: 10, y: 20}})
WriteResult({ "nInserted" : 1 })
> db.bar.aggregate([{$project: {coords: ['$point.x', '$point.y']}}])
assert: command failed: {
    "errmsg" : "exception: disallowed field type Array in object expression (at 'coords')",
    "code" : 15992,
    "ok" : 0
} : aggregate failed
The aggregation framework should parse arrays the same way it parses expressions elsewhere. This would yield the following results for the examples above:
// With documents from collection 'foo' from above.
> db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}])
{ "_id" : 0, "intersection" : [ "baz", "bar" ] }
{ "_id" : 1, "intersection" : [ ] }
// With document from collection 'bar' from above.
> db.bar.aggregate([{$project: {coords: ['$point.x', '$point.y']}}])
{ "_id" : 0, "coords" : [ 10, 20 ] }
Note that this is a backwards-incompatible change.
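To make the difference between the two parsing rules concrete, here is a minimal sketch in plain JavaScript. It is a toy evaluator written for illustration only, not MongoDB's implementation: the function `evaluate`, the helper `sameValue`, and the `arraysAsExpressions` flag are all hypothetical names, and only the operators used in this ticket are supported.

```javascript
// Toy expression evaluator for illustration; NOT MongoDB's implementation.
// `evaluate`, `sameValue` and `arraysAsExpressions` are hypothetical names.

// Structural equality, good enough for the small values used here.
function sameValue(x, y) {
  return JSON.stringify(x) === JSON.stringify(y);
}

function evaluate(expr, doc, arraysAsExpressions) {
  if (typeof expr === 'string' && expr.startsWith('$')) {
    // Field path such as '$point.x': walk the document one key at a time.
    return expr.slice(1).split('.').reduce(
      (value, key) => (value == null ? undefined : value[key]), doc);
  }
  if (Array.isArray(expr)) {
    // Proposed rule: evaluate every element. Current rule: return the array verbatim.
    return arraysAsExpressions
      ? expr.map((element) => evaluate(element, doc, arraysAsExpressions))
      : expr;
  }
  if (expr !== null && typeof expr === 'object') {
    const op = Object.keys(expr)[0];
    const arg = expr[op];
    switch (op) {
      case '$toLower':
        return String(evaluate(arg, doc, arraysAsExpressions)).toLowerCase();
      case '$setIntersection': {
        // The operator's own argument list is always evaluated element by element.
        const [first, ...rest] = arg.map(
          (operand) => evaluate(operand, doc, arraysAsExpressions));
        // Keep elements of the first operand found in every other operand
        // (duplicates are not removed in this sketch).
        return first.filter(
          (v) => rest.every((set) => set.some((w) => sameValue(v, w))));
      }
      default:
        throw new Error('unsupported operator: ' + op);
    }
  }
  return expr; // Numbers, plain strings, null, and so on are literals.
}

const intersectionExpr = {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]};
const fooDoc0 = {_id: 0, a: ['foo', 'bar', 'baz'], b: 'bar', c: 'Baz'};
const fooDoc1 = {_id: 1, a: ['foo', '$b'], b: 'bar', c: 'Baz'};
const barDoc0 = {_id: 0, point: {x: 10, y: 20}};

console.log(evaluate(intersectionExpr, fooDoc0, false)); // [] (arrays as literals)
console.log(evaluate(intersectionExpr, fooDoc1, false)); // [ '$b' ]
console.log(evaluate(intersectionExpr, fooDoc0, true));  // [ 'bar', 'baz' ] (arrays as expressions)
console.log(evaluate(intersectionExpr, fooDoc1, true));  // []
console.log(evaluate(['$point.x', '$point.y'], barDoc0, true)); // [ 10, 20 ]
```

The same expression gives different results for both documents depending on the flag, which is why the change is not backwards compatible. Element order in the intersection is not significant, so the sketch's `[ 'bar', 'baz' ]` corresponds to the `[ "baz", "bar" ]` shown above.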
=======================================
Original description:
Builds an array.
input:
{a:1, b:2, c:3}
operation:
{$project: {array: {$array: ['$a', '$c', {$add:['$b', '$c'] }] } } }
output:
{array: [1, 3, 5]}
An open issue is how to handle missing fields (e.g. {$push:['$d']}). When building an object, we would omit that field from the output. Omitting it would also be the expected behavior if the array is used as a list or set, but could be a problem if indexed access is important. Possible solutions:
- Omit the field, shrinking the array.
- Replace the field with null, keeping the array the same size, but creating a value and differing from object behavior.
- Error out.
- Provide separate operators, such as $array for when indexes are important and $list for when they aren't.
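The first three options can be compared side by side with a small sketch in plain JavaScript. The helper `buildArray` and its policy names ('omit', 'null', 'error') are hypothetical and exist only for this illustration; they are not MongoDB operators or options.

```javascript
// Hypothetical helper illustrating the missing-field policies listed above.
// The policy names are illustrative only, not MongoDB options.
function buildArray(doc, fields, onMissing) {
  const out = [];
  for (const field of fields) {
    const value = doc[field];
    if (value !== undefined) {
      out.push(value);
    } else if (onMissing === 'omit') {
      // Skip the element, so the array shrinks and later indexes shift.
    } else if (onMissing === 'null') {
      out.push(null); // Keep positions stable by inserting an explicit null.
    } else if (onMissing === 'error') {
      throw new Error('missing field: ' + field);
    } else {
      throw new Error('unknown policy: ' + onMissing);
    }
  }
  return out;
}

const inputDoc = {a: 1, b: 2, c: 3};
console.log(buildArray(inputDoc, ['a', 'd', 'c'], 'omit')); // [ 1, 3 ]
console.log(buildArray(inputDoc, ['a', 'd', 'c'], 'null')); // [ 1, null, 3 ]
```

Under the fourth option, an index-preserving operator would behave like the 'null' policy and a list-style operator like the 'omit' policy, assuming that split is what the two operator names are meant to convey.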
Issue Links
- documents SERVER-8141 Avoid treating arrays as literals in the aggregation pipeline (Closed)