[SERVER-73504] Optimizer assumes setUnion is commutative when it is not in terms of type
Created: 31/Jan/23  Updated: 16/Feb/23  Resolved: 16/Feb/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Ian Boros
Assignee: Backlog - Query Optimization
Resolution: Won't Do
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Query Optimization
Participants:

 Description   

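Our optimizer assumes that $setUnion is commutative, but in certain pathological (and arguably unimportant) cases it is not: when the operands contain values that compare equal but differ in type, the type of the element kept in the result depends on operand order. For example,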

// {$setUnion: [[0], [NumberDecimal("-0")]]}
// produces: [0]
 
// On the other hand, {$setUnion: [[NumberDecimal("-0")], [0]]}
// produces: [NumberDecimal("-0")]
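
The two results above are consistent with a union that keeps whichever of two equal-comparing values it sees first. The following is a minimal sketch of that idea in plain JavaScript, not the server's implementation: the type-tagged value objects and the helper names (num, sameMember, setUnionFirstWins) are hypothetical and exist only to illustrate why swapping the operands can change the type of the surviving element.

```javascript
// Hypothetical model only; none of these helpers exist in the server or the shell.
// Each value carries an explicit type tag so int 0 and decimal -0 can be told apart.
function num(type, value) {
    return {type: type, value: value};
}

// Assumed membership test: values are the same set member when they are
// numerically equal, regardless of type. In JavaScript, -0 === 0 is true.
function sameMember(x, y) {
    return x.value === y.value;
}

// Union that keeps the first representative seen for each set member.
function setUnionFirstWins(...arrays) {
    const out = [];
    for (const arr of arrays) {
        for (const v of arr) {
            if (!out.some((seen) => sameMember(seen, v))) {
                out.push(v);
            }
        }
    }
    return out;
}

const intZero = num("int", 0);
const decimalNegZero = num("decimal", -0);

// Same operands, opposite order.
const ab = setUnionFirstWins([intZero], [decimalNegZero]);
const ba = setUnionFirstWins([decimalNegZero], [intZero]);

console.log(ab.map((v) => v.type));  // type of the element kept for [int], [decimal]
console.log(ba.map((v) => v.type));  // type of the element kept for [decimal], [int]
```

Under this model both results contain a single element, but its type follows the first operand, so any rewrite that reorders the operands can change the observable type of the output.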

Here is a repro that shows how the query behaves differently when optimizations are enabled vs disabled, written by @Justin Seyster.

(function() {
"use strict";
 
const coll = db.set_union_optimization;
coll.drop();
coll.insert({a: [-0]});
 
const pipeline = [
    {$unwind: "$a"},
    {
        $project: {
            _id: 0,
            union: {
                $setUnion: [
                    [
                        NumberDecimal("-0"),
                    ],
                    ["$a"]
                ]
            }
        }
    },
    // This stage is unnecessary but goes to show that a failure like this is technically possible
    // even if we tune the fuzzer so that NumberDecimal("-0") === 0.
    //{$addFields: {type: {$type: {$arrayElemAt: ["$union", 0]}}}}
];
 
const resultWithOptimizations = coll.aggregate(pipeline).toArray();
 
assert.commandWorked(
    db.adminCommand({configureFailPoint: "disableMatchExpressionOptimization", mode: "alwaysOn"}));
assert.commandWorked(
    db.adminCommand({configureFailPoint: "disablePipelineOptimization", mode: "alwaysOn"}));
coll.getPlanCache().clear();
 
const resultWithoutOptimizations = coll.aggregate(pipeline).toArray();
assert.sameMembers(resultWithOptimizations, resultWithoutOptimizations);
}());

The task of this ticket is to decide whether this behavior merits any server changes, or whether we should ignore it and continue to treat the expression as commutative.

