[SERVER-42105] Several aggregation pipeline operators are not commutative when inputs are null and invalid Created: 08/Jul/19 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | George Wangensteen | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | query-44-grooming | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Query Optimization |
| Operating System: | ALL |
| Steps To Reproduce: | db.example.aggregate([ { $project: { A: 1, acc: { $setIntersection: [null, "string"] } } } ]); db.example.aggregate([ { $project: { A: 1, acc: { $setIntersection: ["string", null] } } } ]); The first call returns null; the second throws. The same behavior holds if $setIntersection is replaced with $add, $multiply, etc. |
| Participants: |
| Description |
|
While working on SERVER-41992, which found that $setIntersection was not commutative on null and empty inputs, we found that $setIntersection was also not commutative on null and invalid inputs. If a null input came first, the operation would return null and exit cleanly; if an invalid input came first, the operation would throw. We also noticed that $setIntersection is far from the only agg operator with this problem: see, for example, $add or $multiply. This seems problematic because it means that, for an operation we report as commutative, the user's aggregation pipeline might exit cleanly or might fail with an exception, depending only on the order of the arguments. Even for non-commutative operations, this behavior might be confusing to users: it masks certain instances of invalid input depending on whether an argument before the invalid input is null. It is also confusing as a general inconsistency: should users expect every agg operator that is passed a null to return null, even if other input is invalid, or should operators always throw when any input is invalid? |
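The order dependence described above can be illustrated with a small standalone model. This is only a sketch of the reported behavior (null short-circuits before later arguments are validated), not the server's implementation; `modelSetIntersection` and `outcome` are hypothetical names introduced for illustration.

```javascript
// Simplified model of the behavior reported in this ticket.
// Assumption: arguments are checked left to right, and the first null
// ends evaluation before any later argument is validated.
function modelSetIntersection(args) {
  let result = null;
  for (const arg of args) {
    if (arg === null || arg === undefined) {
      return null; // later arguments are never inspected
    }
    if (!Array.isArray(arg)) {
      throw new TypeError("all operands must be arrays");
    }
    const current = new Set(arg);
    result = result === null
      ? current
      : new Set([...result].filter((x) => current.has(x)));
  }
  return result === null ? [] : [...result];
}

// Hypothetical helper: reports whether a call exits cleanly or throws.
function outcome(args) {
  try {
    return { ok: true, value: modelSetIntersection(args) };
  } catch (e) {
    return { ok: false, error: e.message };
  }
}

console.log(outcome([null, "string"])); // exits cleanly with a null value
console.log(outcome(["string", null])); // reports the thrown error
```

In this model the same two arguments produce a clean null or an exception depending only on their order, which is the inconsistency the ticket asks about.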
| Comments |
| Comment by Charlie Swanson [ 10/Jul/19 ] |
|
FYI max.hirschhorn, david.storch and claire.childs this is pretty related to our conversation about how $and and $or will push the constants to the back, so it could have some optimization fuzzer implications. I don't think it changes what we agreed to do but it's worth a look. This one was found by code inspection mostly. |