Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-9377

Allow collecting "top" N values for each group

    • Fully Compatible
    • QE 2021-10-18, QE 2021-11-29, QE 2021-12-13

      Issue Status as of Jan 6, 2022

      Summary

      • Accumulators that return an array of top/bottom n elements for each group according to the specified order
        {
           $topN: {
               n: <n expression>,
               sortBy: <sort spec>,
               output: <expression>
           }
        }
        
        {
           $bottomN: {
               n: <n expression>,
               sortBy: <sort spec>,
               output: <expression>
           }
        }
        

        To return one top/bottom element:

        {
           $top: {
               sortBy: <sort spec>,
               output: <expression>
           }
        }
        
        {
           $bottom: {
               sortBy: <sort spec>,
               output: <expression>
           }
        }
        
      • Accumulators that return first/last n elements for each group according to the current order of the documents:
        {
           $firstN: {
               n: <n expression>,
               input: <expression>
           }
        }
        
        {
           $lastN: {
               n: <n expression>,
               input: <expression>
           }
        }
        
      • Accumulators that return n highest/lowest values for each group
        {
           $maxN: {
               n: <n expression>,
               input: <expression>
           }
        }
        
        {
           $minN: {
               n: <n expression>,
               input: <expression>
           }
        }
        
      • $firstN/$lastN/$minN/$maxN are also available as aggregation expressions for array fields.

      Examples

      db.scores.insert([
       {user: "user1", game:"game_A", score: 2345},
       {user: "user2", game:"game_A", score: 100},
       {user: "user3", game:"game_A", score: 555},
       {user: "user4", game:"game_A", score: 53234},
       {user: "user1", game:"game_B", score: 65438},
       {user: "user2", game:"game_B", score: 865},
       {user: "user3", game:"game_B", score: 400},
       {user: "user4", game:"game_B", score: 9865}
      ]);
      
      db.scores.aggregate([
        {
           $group:
           {
              _id: "$game",
              leaderboard:
                 {
                    $topN:
                       {
                          sortBy: {score: -1},
                          n: 3,
                          output: {userName: '$user', score: '$score'}
                       }
                 }
           }
        }
      ])
      

      Output

      [{
       user: "game_A",
       leaderboard: [
         {userName: "user4", score: 53234},
         {userName: "user1", score: 2345},
         {userName: "user3", score: 555},
       ]
      },
      {
       user: "game_B",
       leaderboard: [
         {user: "user1", score: 65438},
         {user: "user4", score: 9865},
         {user: "user2", score: 865},
       ]
      }]
      
      db.scores.aggregate( [
        {
           $group:
           {
              _id: "$game",
              threeHighestScores:
                 {
                    $maxN:
                       {
                          input: "$score",
                          n: 3
                       }
                 }
           }
        }
      ] )
      

      Output:

      [{
       _id: "game_A",
       threeHighestScores: [ 53234, 2345, 555]
      },
      {
       _id: "game_B",
       leaderboard: [
         threeHighestScores: [ 65438, 9865, 865]
      }]
      

      Versions
      This feature will be available starting in version 5.2.0 once the upgrade (including upgrading the FCV) is fully complete.

      Original Description
      Title: Extend $push or $max to allow collecting "top" N values per _id key in $group phase

      Syntax

      {$pushFirstN: [<expression>, N]}
      {$pushLastN: [<expression>, N]}
      {$pushMinN: [<expression>, N]}
      {$pushMaxN: [<expression>, N]}
      

      Examples

      > db.coll.insert([
        {_id: "user1", game:"A", score: 2345},
        {_id: "user3", game:"A", score: 100},
        {_id: "user4", game:"A", score: 555},
        {_id: "user7", game:"A", score: 53234}
      ]);
      > db.coll.aggregate([
          {$sort: {score: -1}},
          {$group: {
              _id: "$game",
              leaderBoardUsers: {
                  $pushFirstN: ["$_id", 10]
              }
          }
      ])
      {_id: "A", leaderBoardUsers: ["user7", "user1", "user4", "user3"]}
      
      
      // Example 2
      > db.coll.insert([
        {_id: "user1", game:"A", score: 2345}
        {_id: "user3", game:"A", score: 100}
        {_id: "user4", game:"A", score: 555}
        {_id: "user7", game:"A", score: 53234}
      ]);
      > db.coll.aggregate([
          {$sort: {score: -1}},
          {$group: {
              _id: "$game",
              leaderBoard: {$pushFirstN: [{_id: "$_id", score: "$score"}, 10]}
          }
      ])
      {
        _id: "A",
        leaderBoard: [
          {_id: "user7", score: 53234},
          {_id: "user1", score: 2345},
          {_id: "user4", score: 555},
          {_id: "user3", score: 100}
        ]
      }
      

      Notes

      • This would be the first accumulator to take more than a single argument.

      Errors

      • If the second argument is not a nonnegative integer. If the argument is zero, the result will be an empty array.

      Previous Description:
      Frequently asked by users:

      Analogous to {$group:{_id:"$key", maxval:{$max:"$val"}}} if user needs to gather top N values per key (most recent, highest N, etc) to have ability to do equivalent of {$max:"$val",$limit:5} or $push:{$sort:...,$limit:N} type of idea.

            Votes:
            77 Vote for this issue
            Watchers:
            69 Start watching this issue

              Created:
              Updated:
              Resolved: