Core Server / SERVER-58784

Do the $setUnion, $setDifference, and $setIntersection operations have any guarantees around order?

    • Type: Question
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • QE 2021-09-06

      The names imply that these are set operations, which suggests that the order of the output array is unspecified. However, the classic engine's implementation in the Expression class always produces a particular output order. For $setDifference, the output follows the order of the first input array:

      MongoDB Enterprise > db.c.drop()
      MongoDB Enterprise > db.c.insert({lhs: [5, 4, 3, 2, 1], rhs: [4, 2]})
      
      // This will always produce 5, 3, 1 due to the ordering of "$lhs".
      MongoDB Enterprise > db.c.aggregate([{$project: {out: {$setDifference: ["$lhs", "$rhs"]}}}])
      { "_id" : ObjectId("60f9cc394f0e6d565685249e"), "out" : [ 5, 3, 1 ] }
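
      The ordering seen above can be modeled in plain JavaScript. This is only a sketch of the observed behavior, not the server's code: the function name setDifferenceInInputOrder is hypothetical, and a JavaScript Set compares primitives only, whereas the server compares arbitrary BSON values.

```javascript
// Hypothetical model of the observed $setDifference ordering (not server code).
// Walks lhs in order, skips anything present in rhs, and drops duplicates,
// so the surviving elements keep the relative order they had in lhs.
function setDifferenceInInputOrder(lhs, rhs) {
  const exclude = new Set(rhs); // elements that must not appear in the output
  const seen = new Set();       // elements already emitted, for deduplication
  const out = [];
  for (const value of lhs) {
    if (!exclude.has(value) && !seen.has(value)) {
      seen.add(value);
      out.push(value);
    }
  }
  return out;
}

console.log(setDifferenceInInputOrder([5, 4, 3, 2, 1], [4, 2])); // [ 5, 3, 1 ]
```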
      

      The $setUnion expression always outputs a sorted array:

      MongoDB Enterprise > db.c.drop()
      MongoDB Enterprise > db.c.insert({lhs: [8, 2, 6], rhs: [4, 2, 1, 7]})
      MongoDB Enterprise > db.c.aggregate([{$project: {out: {$setUnion: ["$lhs", "$rhs"]}}}])
      { "_id" : ObjectId("60f9cca54f0e6d565685249f"), "out" : [ 1, 2, 4, 6, 7, 8 ] }
      

      And the $setIntersection expression also always returns a sorted array:

      MongoDB Enterprise > db.c.drop()
      MongoDB Enterprise > db.c.insert({lhs: [8, 2, 6, 1], rhs: [4, 2, 1, 7, 6]})
      WriteResult({ "nInserted" : 1 })
      MongoDB Enterprise > db.c.aggregate([{$project: {out: {$setIntersection: ["$lhs", "$rhs"]}}}])
      { "_id" : ObjectId("60f9cd204f0e6d56568524a0"), "out" : [ 1, 2, 6 ] }
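
      The sorted output of $setUnion and $setIntersection shown above can be reproduced with a small plain-JavaScript model. This is a sketch under assumptions: the helper names are hypothetical, and the comparator handles numbers only, whereas the server orders arbitrary BSON values.

```javascript
// Numeric-only comparator; an assumption made for this sketch.
const byNumber = (a, b) => a - b;

// Hypothetical model: deduplicate all inputs, then emit in sorted order.
function sortedSetUnion(...arrays) {
  return [...new Set(arrays.flat())].sort(byNumber);
}

// Hypothetical model: keep elements of the first array that occur in every
// other array, deduplicated, then emit in sorted order.
function sortedSetIntersection(first, ...rest) {
  const others = rest.map((arr) => new Set(arr));
  return [...new Set(first)]
    .filter((value) => others.every((set) => set.has(value)))
    .sort(byNumber);
}

console.log(sortedSetUnion([8, 2, 6], [4, 2, 1, 7]));              // [ 1, 2, 4, 6, 7, 8 ]
console.log(sortedSetIntersection([8, 2, 6, 1], [4, 2, 1, 7, 6])); // [ 1, 2, 6 ]
```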
      

      SBE's implementation, on the other hand, provides no such guarantees. It uses an unordered flat hash set for the output array under the hood, which means the output array has an arbitrary order. Is this acceptable given the lack of ordering guarantees for the set expressions, or should we change the implementation in SBE to produce arrays ordered in the same way as the classic engine would?
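
      If the output order is treated as unspecified, callers and tests would need to compare results without regard to order. A minimal sketch of such a check in plain JavaScript follows; the helper name sameElements is hypothetical, and it assumes numeric elements with duplicates already removed by the set expression.

```javascript
// Hypothetical order-insensitive comparison for numeric arrays.
// Sorts copies of both arrays so the inputs are left untouched.
function sameElements(actual, expected) {
  if (actual.length !== expected.length) {
    return false;
  }
  const a = [...actual].sort((x, y) => x - y);
  const b = [...expected].sort((x, y) => x - y);
  return a.every((value, i) => value === b[i]);
}

// An arbitrarily ordered result still matches the expected set.
console.log(sameElements([6, 1, 2], [1, 2, 6])); // true
console.log(sameElements([6, 1, 2], [1, 2, 7])); // false
```

      A check like this passes whether the array arrives sorted, in input order, or in hash-set order.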

            Assignee: jennifer.peshansky@mongodb.com Jennifer Peshansky (Inactive)
            Reporter: david.storch@mongodb.com David Storch