Core Server / SERVER-31831

Improve aggregation set operations for array of objects

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Aggregation Framework
    • Labels: None
    • Query Optimization

      Currently, it is hard to compute stats on, or transform, a nested array of objects where a subset of the fields makes up the key (hash) of the object. It is possible with $unwind and $group, but that becomes a problem when operating on multiple fields in the documents at the same time. The other way is to $map using $concat on the key fields, then $setUnion, then $map again, then $reduce using $cond with $in, but that is far too slow.
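      For reference, the $unwind and $group workaround might look roughly like the sketch below. The field name `items` and the choice to sum `c` are assumptions made for illustration, not taken from a real schema. The key fields here are `a` and `b`.

```javascript
// Sketch of the $unwind + $group workaround (field names are hypothetical).
// Assumed document shape: { _id, items: [{a, b, c}, ...], ...otherFields }
const pipeline = [
  // One output document per array element.
  { $unwind: "$items" },
  // Merge elements that share the key (a, b) within the same source document.
  { $group: {
      _id: { doc: "$_id", a: "$items.a", b: "$items.b" },
      c: { $sum: "$items.c" }
  } },
  // Rebuild one array per source document. Any other top-level field is lost
  // at this point unless it is carried through explicitly (e.g. with $first).
  { $group: {
      _id: "$_id.doc",
      items: { $push: { a: "$_id.a", b: "$_id.b", c: "$c" } }
  } }
];
```

      The pipeline would be passed to `aggregate()` on the collection. Every additional array or top-level field that needs the same treatment has to be threaded through both $group stages, which is where this approach gets unwieldy.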

      It would be helpful if the set operations allowed specifying a comparison function. When a custom comparison function is provided, $setUnion, $setIntersection, and $setDifference could require a reduce function to merge duplicates.

      { $setEquals: {
          input: [
            [{a: 1, b: 1, c: 1}, {a: 2, b: 1, c: 3}, {a: 2, b: 1, c: 4}],
            [{a: 1, b: 1, c: 2}, {a: 2, b: 1, c: 5}]
          ],
          as: ["item1", "item2"], // name the two objects being compared at a time
          cond: {
            // two objects count as the same set member when a and b match
            $and: [
              {$eq: ["$$item1.a", "$$item2.a"]},
              {$eq: ["$$item1.b", "$$item2.b"]}
            ]
          }
        }
      }
      
      Result: true
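
      To make the intended semantics concrete, here is a minimal sketch in plain JavaScript of how the proposed comparison could behave. `setEqualsBy` and `sameAB` are hypothetical names used only for this illustration, and the sketch assumes `cond` behaves as an equivalence relation.

```javascript
// Hypothetical emulation of the proposed $setEquals with a custom `cond`.
// `sameKey(item1, item2)` plays the role of `cond` with as: ["item1", "item2"].
function setEqualsBy(arrays, sameKey) {
  // The arrays are equal as sets if every element of each array
  // has at least one matching element in every other array.
  return arrays.every((xs) =>
    arrays.every((ys) => xs.every((x) => ys.some((y) => sameKey(x, y))))
  );
}

// Two objects are the same set member when both a and b match.
const sameAB = (item1, item2) => item1.a === item2.a && item1.b === item2.b;

const equal = setEqualsBy(
  [
    [{a: 1, b: 1, c: 1}, {a: 2, b: 1, c: 3}, {a: 2, b: 1, c: 4}],
    [{a: 1, b: 1, c: 2}, {a: 2, b: 1, c: 5}],
  ],
  sameAB
);
console.log(equal); // true
```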
      
      { $setUnion: {
          input: [
            [{a: 1, b: 1, c: 1}, {a: 2, b: 1, c: 3}, {a: 2, b: 1, c: 4}],
            [{a: 1, b: 1, c: 2}, {a: 2, b: 1, c: 5}]
          ],
          as: ["item1", "item2"], // name the two objects being compared at a time
          cond: {
            $and: [
              {$eq: ["$$item1.a", "$$item2.a"]},
              {$eq: ["$$item1.b", "$$item2.b"]}
            ]
          },
          // merge the objects that cond treats as duplicates
          reduce: {
            initialValue: { c: 0 },
            in: {
              a: "$$this.a",
              b: "$$this.b",
              c: { $add: ["$$value.c", "$$this.c"] }
            }
          }
        }
      }
      
      Result: [{a: 1, b: 1, c: 3}, {a: 2, b: 1, c: 12}]
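
      The union-with-reduce behaviour can be sketched the same way in plain JavaScript. `setUnionBy` is a hypothetical helper, not an existing API. It groups duplicates with a pairwise comparison and then folds each group, which is one possible reading of the proposal rather than a suggested server implementation.

```javascript
// Hypothetical emulation of the proposed $setUnion with `cond` and `reduce`.
function setUnionBy(arrays, sameKey, initialValue, merge) {
  const groups = []; // each group holds items that sameKey treats as duplicates
  for (const item of arrays.flat()) {
    const group = groups.find((g) => sameKey(g[0], item));
    if (group) {
      group.push(item);
    } else {
      groups.push([item]);
    }
  }
  // Fold every group into a single object, starting from initialValue.
  return groups.map((g) => g.reduce(merge, initialValue));
}

const merged = setUnionBy(
  [
    [{a: 1, b: 1, c: 1}, {a: 2, b: 1, c: 3}, {a: 2, b: 1, c: 4}],
    [{a: 1, b: 1, c: 2}, {a: 2, b: 1, c: 5}],
  ],
  // cond: same key when a and b match
  (item1, item2) => item1.a === item2.a && item1.b === item2.b,
  // reduce.initialValue
  { c: 0 },
  // reduce.in: `value` is the accumulator, `current` stands in for $$this
  (value, current) => ({ a: current.a, b: current.b, c: value.c + current.c })
);
console.log(JSON.stringify(merged));
// [{"a":1,"b":1,"c":3},{"a":2,"b":1,"c":12}]
```

      The pairwise grouping costs one comparison per existing group for each item. A native operator could presumably do better when the condition reduces to equality on a fixed set of key fields.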
      

            Assignee: [DO NOT USE] Backlog - Query Optimization (backlog-query-optimization)
            Reporter: Joel Goldfinger (devnopt)
            Votes: 1
            Watchers: 8
