Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-4589

aggregation: need an array indexing operator

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1.5
    • Component/s: Aggregation Framework
    • Labels:
      None
    • Backwards Compatibility:
      Minor Change
    • Sprint:
      Quint Iteration 4, Quint Iteration 5

      Description

      At present, there's no way to extract a specific element from an array-valued field. An operator should be added that could be used in a projection, e.g.:

      db.runCommand({aggregate:"c", pipeline:[
      { $project : {
      foo :

      { $at:["$arrayField", 5] }

      }}
      ]});

      The effect of this would be put add the computed field (virtual field) foo to the projection, and give it the value of arrayField[5].

        Issue Links

          Activity

          Hide
          scotthernandez Scott Hernandez (Inactive) added a comment -

          Should this support something like $slice as well?

          Show
          scotthernandez Scott Hernandez (Inactive) added a comment - Should this support something like $slice as well?
          Hide
          cwestin Chris Westin added a comment -

          $slice ounds like a good idea. Going after a single element is a special case of $slice, so it seems like we should just implement that, and support functional use ($slice) as well as array element syntax to call the same thing.

          Show
          cwestin Chris Westin added a comment - $slice ounds like a good idea. Going after a single element is a special case of $slice, so it seems like we should just implement that, and support functional use ($slice) as well as array element syntax to call the same thing.
          Hide
          gavin@buyspace.com Gavin Hogan added a comment -

          This would be very helpful in combination with $unwind.

          Show
          gavin@buyspace.com Gavin Hogan added a comment - This would be very helpful in combination with $unwind.
          Hide
          cacsar Christian Csar added a comment -

          I think it's important to note that for $match this is not a new feature, rather it appears to be a bug in a currently existing feature (and arguably for $project as well). See the documented syntax at http://www.mongodb.org/display/DOCS/Dot+Notation+%28Reaching+into+Objects%29#DotNotation%28ReachingintoObjects%29-ArrayElementbyPosition and the language for $match which says "The syntax passed to the $match is identical to the query syntax." See https://groups.google.com/d/topic/mongodb-user/7o5frOLbPQ8/discussion for examples showing limitations of $match vs query in regard to this.

          At present this makes it seemingly impossible to make use of the individual values in geospatial fields. I'd appreciate it if this was reclassified as a bug.

          Show
          cacsar Christian Csar added a comment - I think it's important to note that for $match this is not a new feature, rather it appears to be a bug in a currently existing feature (and arguably for $project as well). See the documented syntax at http://www.mongodb.org/display/DOCS/Dot+Notation+%28Reaching+into+Objects%29#DotNotation%28ReachingintoObjects%29-ArrayElementbyPosition and the language for $match which says "The syntax passed to the $match is identical to the query syntax." See https://groups.google.com/d/topic/mongodb-user/7o5frOLbPQ8/discussion for examples showing limitations of $match vs query in regard to this. At present this makes it seemingly impossible to make use of the individual values in geospatial fields. I'd appreciate it if this was reclassified as a bug.
          Hide
          asya Asya Kamsky added a comment -

          Note: $match does allow referencing positions of array, at least in the latest master:

          > db.array.aggregate({$match:{"a.0":1}})
          { "_id" : ObjectId("5282869acefcd1a08068e472"), "a" : [  1,  2,  3 ] }
          { "_id" : ObjectId("5282869fcefcd1a08068e473"), "a" : [  1 ] }
          Fetched 2 record(s) in 0ms

          This allows matching only documents where a is array or a is non-array, but the same doesn't work in {{ $project {$cond} }} for example.

          Show
          asya Asya Kamsky added a comment - Note: $match does allow referencing positions of array, at least in the latest master: > db.array.aggregate({$match:{"a.0":1}}) { "_id" : ObjectId("5282869acefcd1a08068e472"), "a" : [ 1, 2, 3 ] } { "_id" : ObjectId("5282869fcefcd1a08068e473"), "a" : [ 1 ] } Fetched 2 record(s) in 0ms This allows matching only documents where a is array or a is non-array, but the same doesn't work in {{ $project {$cond} }} for example.
          Hide
          yourdeveloperfriend Nathan Tate added a comment - - edited

          I'm running into a similar issue where I want to pluck the nth item from an array in each matching document:

          db.users.aggregate({
            $group: {
              _id: null,
              value: {$addToSet: '$values.4'}
            }
          });

          I can unwind the values, then extract the first or the last, but getting one of the values at a specific index seems impossible.

          Show
          yourdeveloperfriend Nathan Tate added a comment - - edited I'm running into a similar issue where I want to pluck the nth item from an array in each matching document: db.users.aggregate({ $group: { _id: null , value: {$addToSet: '$values.4' } } }); I can unwind the values, then extract the first or the last, but getting one of the values at a specific index seems impossible.
          Hide
          cacsar Christian Csar added a comment -

          Having been reminded of this once more again in regards to Geo, it appears that one of the links above is now dead.
          http://docs.mongodb.org/manual/tutorial/query-documents/#match-a-specific-element-of-an-array and http://docs.mongodb.org/manual/reference/operator/aggregation/match/#match-aggregation are likely now the best documentation that this is a bug whether a doc bug or a code bug rather than a feature request, though perhaps it was fixed for $match as commented above (I have not confirmed, if so a comment would be in order)?

          With regards to $project, http://docs.mongodb.org/manual/reference/operator/aggregation/project/#include-specific-fields-from-embedded-documents specifies that Dot Notation can be used, and http://docs.mongodb.org/manual/core/document/#dot-notation does still specify that it applies to specific elements of arrays. Again I have not confirmed whether this limitation still exists, but I would have expected some additional comment on this.

          Show
          cacsar Christian Csar added a comment - Having been reminded of this once more again in regards to Geo, it appears that one of the links above is now dead. http://docs.mongodb.org/manual/tutorial/query-documents/#match-a-specific-element-of-an-array and http://docs.mongodb.org/manual/reference/operator/aggregation/match/#match-aggregation are likely now the best documentation that this is a bug whether a doc bug or a code bug rather than a feature request, though perhaps it was fixed for $match as commented above (I have not confirmed, if so a comment would be in order)? With regards to $project, http://docs.mongodb.org/manual/reference/operator/aggregation/project/#include-specific-fields-from-embedded-documents specifies that Dot Notation can be used, and http://docs.mongodb.org/manual/core/document/#dot-notation does still specify that it applies to specific elements of arrays. Again I have not confirmed whether this limitation still exists, but I would have expected some additional comment on this.
          Hide
          roland.bouman Roland Bouman added a comment -

          I completely agree with mr Christian Csar. This is a bug. The dot notation does work for subdocuments in the $project stage. Why would it conceptually be different for arrays? That simply doesn't make sense.

          Mean while, this is preventing me from flattening specific array elements to properties. I'm arriving at the conclusion that this is currently simply not possible in mongodb - (at least if we discount map/reduce).

          Show
          roland.bouman Roland Bouman added a comment - I completely agree with mr Christian Csar. This is a bug. The dot notation does work for subdocuments in the $project stage. Why would it conceptually be different for arrays? That simply doesn't make sense. Mean while, this is preventing me from flattening specific array elements to properties. I'm arriving at the conclusion that this is currently simply not possible in mongodb - (at least if we discount map/reduce).
          Hide
          roland.bouman Roland Bouman added a comment - - edited

          I'd like to add that I do not see any reason for introducing another operator or to be forced to use $slice to do this. It would be nice if $slice was supported, but this is another matter - this has simply to do with the dot notation not working as advertised. Besides, $slice still returns an array - The problem here is that a dot notation expression should simply extract the value from the array (having whichever type that value has).

          Consider the following query:

          db.zips.aggregate([{$match: {"loc.0": -72.188455}}, {$project: {_id:1, loc: 1, lon: {$let: { vars:

          {lon: "$loc.0"}

          , in: "$$lon"}} }}]);

          this returns:

          { "_id" : "01010", "loc" : [ -72.188455, 42.116543 ], "lon" : [ ] }

          This is wrong. Expected was:

          { "_id" : "01010", "loc" : [ -72.188455, 42.116543 ], "lon" : -72.188455 }

          Notice that
          1) the expression "loc.0" is valid and can be successfully filtered to -72.188455 in the $match stage. This indicates the expression itself is valid
          2) lon is returned as an empty array. This is wrong for two reasons: a) we're expecting the scalar (-72.188455), b) the array is empty.

          The simpler query without $let, should, as far as I can see also work:

          db.zips.aggregate([{$match: {"loc.0": -72.188455}}, {$project: {_id:1, loc: 1, lon: "$loc.0" }}]);

          ...but yields the same result - empty array. This already indicates that mongodb knows it is accessing an array so at least it partially understands that "loc" is being accessed (it is the only array in the document)

          Show
          roland.bouman Roland Bouman added a comment - - edited I'd like to add that I do not see any reason for introducing another operator or to be forced to use $slice to do this. It would be nice if $slice was supported, but this is another matter - this has simply to do with the dot notation not working as advertised. Besides, $slice still returns an array - The problem here is that a dot notation expression should simply extract the value from the array (having whichever type that value has). Consider the following query: db.zips.aggregate([{$match: {"loc.0": -72.188455}}, {$project: {_id:1, loc: 1, lon: {$let: { vars: {lon: "$loc.0"} , in: "$$lon"}} }}]); this returns: { "_id" : "01010", "loc" : [ -72.188455, 42.116543 ], "lon" : [ ] } This is wrong. Expected was: { "_id" : "01010", "loc" : [ -72.188455, 42.116543 ], "lon" : -72.188455 } Notice that 1) the expression "loc.0" is valid and can be successfully filtered to -72.188455 in the $match stage. This indicates the expression itself is valid 2) lon is returned as an empty array. This is wrong for two reasons: a) we're expecting the scalar (-72.188455), b) the array is empty. The simpler query without $let, should, as far as I can see also work: db.zips.aggregate( [{$match: {"loc.0": -72.188455}}, {$project: {_id:1, loc: 1, lon: "$loc.0" }}] ); ...but yields the same result - empty array. This already indicates that mongodb knows it is accessing an array so at least it partially understands that "loc" is being accessed (it is the only array in the document)
          Hide
          asya Asya Kamsky added a comment -

          This would return a scalar (or value of type of array element, rather than array of a single element which is what slice would return.

          Show
          asya Asya Kamsky added a comment - This would return a scalar (or value of type of array element, rather than array of a single element which is what slice would return.
          Hide
          xgen-internal-githook Githook User added a comment -

          Author:

          {u'username': u'cswanson310', u'name': u'Charlie Swanson', u'email': u'charlie.swanson@mongodb.com'}

          Message: SERVER-4589: Add $arrayElemAt aggregation expression
          Branch: master
          https://github.com/mongodb/mongo/commit/c317c2b9416ffeb9a79ac8bec1eb79ae10448a4a

          Show
          xgen-internal-githook Githook User added a comment - Author: {u'username': u'cswanson310', u'name': u'Charlie Swanson', u'email': u'charlie.swanson@mongodb.com'} Message: SERVER-4589 : Add $arrayElemAt aggregation expression Branch: master https://github.com/mongodb/mongo/commit/c317c2b9416ffeb9a79ac8bec1eb79ae10448a4a
          Hide
          charlie.swanson Charlie Swanson added a comment -

          The new expression is called $arrayElemAt. It takes two arguments, an array and an index. As is hopefully self explanatory, it returns the element at the given index in the array. Negative indices are accepted as indexes from the back of the array. If the index is out of bounds, it returns the missing value, meaning the field will not exist in the output.

          If either argument is null, it returns null, otherwise it errors if the first argument is not an array, or the second argument cannot be represented as an integer value (2, NuberLong(2), and 2.0 work, 2.5 and NumberLong(2^40) do not).

          For all the edge case details, see the javascript test which was part of the commit.

          Show
          charlie.swanson Charlie Swanson added a comment - The new expression is called $arrayElemAt . It takes two arguments, an array and an index. As is hopefully self explanatory, it returns the element at the given index in the array. Negative indices are accepted as indexes from the back of the array. If the index is out of bounds, it returns the missing value, meaning the field will not exist in the output. If either argument is null, it returns null, otherwise it errors if the first argument is not an array, or the second argument cannot be represented as an integer value (2, NuberLong(2), and 2.0 work, 2.5 and NumberLong(2^40) do not). For all the edge case details, see the javascript test which was part of the commit.
          Show
          oleg@evergage.com Oleg Rekutin added a comment - For anyone interested, backports of this useful feature: 3.0: https://github.com/evergage/mongo/commit/cde793ed9c85b9356e3c20a6bc09f536cc35f245 2.6: https://github.com/evergage/mongo/commit/49dc3c9fc5c723034dfd88d5a2a97010a9172984

            People

            • Votes:
              75 Vote for this issue
              Watchers:
              48 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                  Agile