Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-85337

$bucketAuto $first does not obey input order

    XMLWordPrintableJSON

Details

    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Major - P3 Major - P3
    • None
    • None
    • None
    • Query Optimization
    • ALL
    • QO 2024-02-05, QO 2024-02-19

    Description

      If I sort by {a: -1} and use $first, I should get the largest 'a' value within each group.

      For example, using $bucket I get the expected result:

      > db.c.find()
      { "_id" : 0, "a" : 0 }
      { "_id" : 1, "a" : 1 }
      { "_id" : 2, "a" : 2 }
      { "_id" : 3, "a" : 3 }
      { "_id" : 4, "a" : 4 }
      { "_id" : 5, "a" : 5 }
      { "_id" : 6, "a" : 6 }
      { "_id" : 7, "a" : 7 }
      { "_id" : 8, "a" : 8 }
      { "_id" : 9, "a" : 9 }
       
      > db.c.aggregate([ {$sort: {a: -1}}, {$bucket: {groupBy: "$a", boundaries: [0, 5, 10], output: {a: {$first: "$a"}} }} ])
      { "_id" : 0, "a" : 4 }
      { "_id" : 5, "a" : 9 }
      

      While the same thing with $bucketAuto gives the smallest value within each group:

      > db.c.aggregate([ {$sort: {a: -1}}, {$bucketAuto: {groupBy: "$a", buckets: 2, output: {a: {$first: "$a"}} }} ])
      { "_id" : { "min" : 0, "max" : 5 }, "a" : 0 }
      { "_id" : { "min" : 5, "max" : 9 }, "a" : 5 }
      

      It looks like internally $bucketAuto is doing something like:

      • stable-sort by the group key
      • scan the sorted input
        • accumulate each document into the current bucket
        • start a new bucket every N documents (based on input size / num buckets)

      But this doesn’t work for order-sensitive accumulators like $first, because the sort loses some information about the original order--even using a stable sort. The documents within each dynamically-chosen bucket are sorted by (group key, input position), instead of by input position.

      Attachments

        Activity

          People

            chii.huang@mongodb.com Chi-I Huang
            david.percy@mongodb.com David Percy
            Votes:
            0 Vote for this issue
            Watchers:
            10 Start watching this issue

            Dates

              Created:
              Updated: