Core Server / SERVER-73504

Optimizer assumes setUnion is commutative when it is not in terms of type

    • Type: Task
    • Resolution: Won't Do
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Query Optimization

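      Our optimizer assumes that $setUnion is commutative, but in certain unimportant pathological cases it is not: when two elements compare equal but have different types, the element that ends up in the result depends on the order of the arguments. For example: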

      // {$setUnion: [[0], [NumberDecimal("-0")]]}
      // produces: [0]
      
      // On the other hand, {$setUnion: [[NumberDecimal("-0")], [0]]}
      // produces: [NumberDecimal("-0")]
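
The order dependence above can be reproduced outside the server with a toy model. The sketch below is plain JavaScript, not server code. It assumes two things that are consistent with the outputs shown but are not confirmed by this ticket: elements compare equal whenever their numeric values match, regardless of type, and the union keeps the first element it sees among equal ones. The helpers `dbl`, `dec`, `numericallyEqual`, and `setUnionModel` are hypothetical names introduced only for illustration.

```javascript
// Toy model only: each value carries a BSON-like type tag. Not server code.
const dbl = (v) => ({type: "double", value: v});
const dec = (v) => ({type: "decimal", value: v});

// Assumption: equality ignores the type tag and compares numeric values,
// so 0 and -0 are equal whether they are doubles or decimals.
function numericallyEqual(x, y) {
    return x.value === y.value;  // in JavaScript, -0 === 0 is true
}

// Assumption: the first element seen among equal ones is the one kept.
function setUnionModel(...arrays) {
    const out = [];
    for (const arr of arrays) {
        for (const elem of arr) {
            if (!out.some((kept) => numericallyEqual(kept, elem))) {
                out.push(elem);
            }
        }
    }
    return out;
}

const doubleFirst = setUnionModel([dbl(0)], [dec(-0)]);
const decimalFirst = setUnionModel([dec(-0)], [dbl(0)]);

// Each result has exactly one element, but its type follows argument order.
console.log(doubleFirst[0].type);   // "double"
console.log(decimalFirst[0].type);  // "decimal"
```

Under these assumptions the two results are equal as sets of numeric values, yet anything that inspects the element's type (such as a $type expression) can tell them apart.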
      

      Here is a repro that shows how the query behaves differently when optimizations are enabled vs disabled, written by @Justin Seyster.

      (function() {
      "use strict";
       
      const coll = db.set_union_optimization;
      coll.drop();
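      // Insert a double negative zero and fail fast if the write does not succeed.
      assert.commandWorked(coll.insert({a: [-0]}));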
       
      const pipeline = [
          {$unwind: "$a"},
          {
              $project: {
                  _id: 0,
                  union: {
                      $setUnion: [
                          [
                              NumberDecimal("-0"),
                          ],
                          ["$a"]
                      ]
                  }
              }
          },
          // This stage is unnecessary but goes to show that a failure like this is technically possible
          // even if we tune the fuzzer so that NumberDecimal("-0") === 0.
          //{$addFields: {type: {$type: {$arrayElemAt: ["$union", 0]}}}}
      ];
       
      const resultWithOptimizations = coll.aggregate(pipeline).toArray();
       
      assert.commandWorked(
          db.adminCommand({configureFailPoint: "disableMatchExpressionOptimization", mode: "alwaysOn"}));
      assert.commandWorked(
          db.adminCommand({configureFailPoint: "disablePipelineOptimization", mode: "alwaysOn"}));
      coll.getPlanCache().clear();
       
      const resultWithoutOptimizations = coll.aggregate(pipeline).toArray();
      assert.sameMembers(resultWithOptimizations, resultWithoutOptimizations);
      }());
      

      The task of this ticket is to decide whether this behavior merits any server changes, or whether we should ignore it and continue to treat the expression as commutative.

            Assignee: [DO NOT USE] Backlog - Query Optimization (backlog-query-optimization)
            Reporter: Ian Boros (ian.boros@mongodb.com)
            Votes: 0
            Watchers: 3
