Details
-
Task
-
Resolution: Fixed
-
Major - P3
-
None
-
None
Description
Original title: [SERVER] Investigate changes in SERVER-63099: Constant folding should not assume arithmetic is associative or commutative in the general case
------
Users with queries that multiply or add many constants together along with some fieldpaths in an aggregation pipeline using the $add and $multiply expressions may find that their queries become slower after SERVER-63099 if their fieldpaths are at the beginning of the argument list. See JIRA comments on this ticket for examples.
After SERVER-63099 is merged, users will be able to speed up their queries back to pre-fix performance by moving any fieldpath references to the end of the list of arguments. For example, $multiply: [1, "$v", 2, "$w", 3, 4] can be rewritten to $multiply: [1, 2, 3, 4, "$v", "$w"] to speed the query up again.
The fact that addition and multiplication with these expressions is now defined to be left-to-right associative should be reflected in the documentation for $add and $multiply.
Description of Linked Ticket
Currently, arithmetic aggregation expressions like $multiply are marked as associative and commutative which allows the constant folding optimization to rearrange constants to combine as many of them as possible during optimization.
There are three reasons this is not a valid optimization in general:
1. For integers, this is not a safe optimization due to overflow. For example, 1073741824*-1*2 == -2147483648, but 1073741824*2*-1 overflows.
2. Floating point arithmetic is not generally associative. See this paper, specifically:
Due to roundoff errors, the associative laws of algebra do not necessarily hold for floating-point numbers. For example, the expression (x+y)z has a totally different answer than x(y+z) when x = 1030, y = -1030 and z = 1 (it is 1 in the former case, 0 in the latter). The importance of preserving parentheses cannot be overemphasized.
3. Type coercions occur when performing an operation on multiple types. For example, 1 + 0.5 + 2 will convert 1 from an integer to a float before adding it to 0.5. These type coercions are dependent on the order of operations and could change the result. See the "steps to reproduce" section for an example with `NumberDecimal`.
Because we can't know if a fieldpath will resolve to a specific number type or a value within a specific range, we cannot rule out any kind of overflow or type coercion.
So far I have only looked in to ExpressionMultiply, but ExpressionAdd and other arithmetic operations should be investigated to make sure they respect type coercions and overflow when constant folding.
Attachments
Issue Links
- documents
-
SERVER-63099 Constant folding should not assume arithmetic is associative or commutative in the general case
-
- Closed
-