  1. Core Server
  2. SERVER-21654

Aggregation pipeline with $stdDevPop and $stdDevSamp not returning values correctly


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Works as Designed
    • Affects Version/s: 3.2.0-rc2
    • Fix Version/s: None
    • Component/s: Aggregation Framework
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:
      • Import the ages.json file.
      • Run the following commands in the shell:

        var pipeline = [
          {$group: {_id:"$city", ages: {$push: "$age"}}},
          {$group: {_id:"$_id", average:{$avg: "$ages"}, stdSamp: {$stdDevSamp: "$ages"},  stdPop: {$stdDevPop: "$ages"}}},
        ];
        var res = db.ages.aggregate(pipeline);
        printjson(res);
        


      Description

      Running an aggregation pipeline that uses $stdDevPop and $stdDevSamp does not produce the expected output.
      If we have a collection whose documents each contain an "age" field, and we want to calculate the standard deviation ($stdDevPop) for a given selector (city), the pipeline should be the following:

      db.ages.aggregate([ {$group: { ages: {$push:"$age"}, _id:"$city"  }}, {$group: { _id: null,  std: {$avg: "$ages"}  }}   ] )
      

      but the output I'm currently getting is the following:

      {
        "waitedMS": NumberLong("0"),
        "result": [
          {
            "_id": null,
            "std": null
          }
        ],
        "ok": 1
      }
      

      Running the same operation through an intermediate collection does produce the expected result:

      db.ages.aggregate([ {$group: { ages: {$push:"$age"}, _id:"$city",  }}, {$out:'aaa'}] )
      db.aaa.aggregate( {$project: {_id: "$_id", s: { $stdDevPop: "$ages"}}})
      {
        "waitedMS": NumberLong("0"),
        "result": [
          {
            "_id": "NYC",
            "s": 29.037802755871343
          }
        ],
        "ok": 1
      }
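
To cross-check a value such as the one above by hand, the two statistics can be computed in plain JavaScript. This is a minimal sketch of the textbook formulas, not the server's implementation. The input array `sampleAges` is hypothetical (the contents of ages.json are not shown in this report), and the function names are chosen only to mirror the operators.

```javascript
// Population standard deviation: sqrt( sum((x - mean)^2) / n ).
// Returning null for an empty input is a choice made for this sketch.
function stdDevPop(values) {
  if (values.length === 0) return null;
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const sumSq = values.reduce((acc, x) => acc + (x - mean) ** 2, 0);
  return Math.sqrt(sumSq / values.length);
}

// Sample standard deviation: same sum of squares, divided by (n - 1).
// Needs at least two values; returns null otherwise (sketch choice).
function stdDevSamp(values) {
  if (values.length < 2) return null;
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const sumSq = values.reduce((acc, x) => acc + (x - mean) ** 2, 0);
  return Math.sqrt(sumSq / (values.length - 1));
}

// Hypothetical data, NOT taken from ages.json.
const sampleAges = [2, 4, 4, 4, 5, 5, 7, 9];

console.log(stdDevPop(sampleAges));  // 2
console.log(stdDevSamp(sampleAges)); // sqrt(32 / 7)
```

The sample form is always at least as large as the population form for the same data, because it divides by n - 1 rather than n.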
      

      I also tested with $avg and saw the same issue.

      There seems to be an issue with pipelined $group operations.
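
The "Works as Designed" resolution is consistent with the documented behavior of these operators. Inside a $group stage, accumulators such as $avg, $stdDevPop and $stdDevSamp take one value per input document and treat an array-valued operand as non-numeric, which yields null. In a $project stage the same operators accept an array. The sketch below shows two pipeline shapes that follow from this, written as plain objects so they can be inspected without a server. The variable names `perCityStats` and `pushThenProject` are hypothetical; the collection and field names come from the report.

```javascript
// Option 1: a single $group that feeds each document's numeric "age"
// directly into the accumulators, one value per document.
const perCityStats = [
  { $group: {
      _id: "$city",
      average: { $avg: "$age" },
      stdSamp: { $stdDevSamp: "$age" },
      stdPop:  { $stdDevPop: "$age" }
  } }
];

// Option 2: keep the $push stage from the report, then apply the
// expression forms in $project, where an array operand is accepted.
const pushThenProject = [
  { $group: { _id: "$city", ages: { $push: "$age" } } },
  { $project: {
      average: { $avg: "$ages" },
      stdSamp: { $stdDevSamp: "$ages" },
      stdPop:  { $stdDevPop: "$ages" }
  } }
];

// Assumed usage in the mongo shell:
//   db.ages.aggregate(perCityStats)
//   db.ages.aggregate(pushThenProject)
```

Option 1 avoids materializing the per-city array at all. Option 2 is closer to the intermediate-collection workaround above, without needing $out.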

            People

            Assignee:
            kelsey.schubert Kelsey T Schubert
            Reporter:
            norberto.leite Norberto Fernando Rocha Leite (Inactive)