[SERVER-66248] Projection with ExpressionObject does not round trip (serialize + parse) Created: 05/May/22 Updated: 29/Oct/23 Resolved: 14/Sep/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.2.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | David Percy | Assignee: | Matt Olma |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | query-director-triage, query-product-urgency-2, query-product-value-1 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Query Optimization
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | QO 2023-08-07, QO 2023-08-21, QO 2023-09-04, QO 2023-09-18 |
| Participants: |
| Description |
|
When we serialize a projection, as in {$set: {a: _}}, we need to be careful when the expression happens to be an object-expression, such as {b: "$_id"}. Currently, serialize() incorrectly produces {$set: {a: {b: "$_id"}}} which has a different meaning. For example, set up a collection like this:
Run this query:
On a non-sharded collection, this correctly returns docs where the entire field 'a' is replaced with a new document {b: 123}:
However, if you set up the same example on a sharded collection:
And you run the same query, you get a wrong answer:
Instead of replacing all of 'a', we have added a field 'b' to every array element, and preserved other subfields. This happens because internally, after optimization, we end up with an AST that we cannot correctly serialize. We have something like (ProjectionNode "a" (ExpressionObject "b" ...)) which we serialize as {$set: {a: {b: _}}}, which is then misinterpreted as (ProjectionNode "a" (ProjectionNode "b" ...)). |
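The round-trip failure described above can be reproduced with a toy model. This is only a sketch, not the server's code: the class names ProjectionNode and ExpressionObject are borrowed from the description, while serialize(), parse_projection(), and the dict-based representation are simplifications assumed for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExpressionObject:
    # Toy stand-in: an object-expression such as {b: "$_id"}.
    fields: dict  # name -> expression (a field-path string or ExpressionObject)


@dataclass
class ProjectionNode:
    # Toy stand-in: one level of a projection tree.
    children: dict  # name -> ProjectionNode or expression


def serialize(node):
    """Naive serializer: both node kinds become plain nested objects."""
    if isinstance(node, ProjectionNode):
        return {k: serialize(v) for k, v in node.children.items()}
    if isinstance(node, ExpressionObject):
        return {k: serialize(v) for k, v in node.fields.items()}
    return node  # leaf, e.g. the field path "$_id"


def parse_projection(spec):
    """Naive parser: every nested plain object is read as a sub-projection."""
    children = {}
    for name, value in spec.items():
        children[name] = parse_projection(value) if isinstance(value, dict) else value
    return ProjectionNode(children)


# (ProjectionNode "a" (ExpressionObject "b" "$_id"))
original = ProjectionNode({"a": ExpressionObject({"b": "$_id"})})
wire = serialize(original)        # {"a": {"b": "$_id"}}
reparsed = parse_projection(wire)  # (ProjectionNode "a" (ProjectionNode "b" "$_id"))

print(wire)
print(reparsed == original)  # the two trees differ, so the round trip is lossy
```

In this model the serialized form carries no information about which node kind produced the nested object, so the parser has to guess, and it guesses "sub-projection".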
| Comments |
| Comment by Githook User [ 13/Sep/23 ] | |
|
Author: {'name': 'Matt Olma', 'email': 'matt.olma@mongodb.com', 'username': 'mattsimply'} Message: | |
| Comment by David Percy [ 19/Jul/23 ] | |
|
At a high level there are two separate changes we would make:
In more detail:
In total I think the amount of work is comparable to adding a new expression. The code change is smaller than that, but the testing is larger. It might make sense to split into two tickets, and then each one might take a few days. | |
| Comment by Steve Tarzia [ 19/Jul/23 ] | |
|
david.percy@mongodb.com can you please do some more investigation here and come up with a plan and estimate? | |
| Comment by David Percy [ 06/May/22 ] | |
|
One way we can solve this is to allow the query to use a keyword ($expr ?) to mark where the projection ends and the expression begins.
This wouldn't need to be a special case in $set / $project / $addFields: we could register a $expr as an expression parser. It wouldn't need a new subclass of Expression either: the parser function could just parse its argument and return that. |
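A minimal sketch of the marker idea from this comment, using a toy model rather than the server's code. The keyword "$expr" is only floated as a possibility above, and the names EXPRESSION_PARSERS, is_operator(), parse_expression(), parse_projection(), and serialize_projection() are hypothetical. The marker is registered like any other expression parser, and its parser simply returns the parsed argument, so no new Expression subclass is needed.

```python
from dataclasses import dataclass

MARKER = "$expr"  # hypothetical keyword; the comment only suggests it as an option


@dataclass
class ExpressionObject:
    fields: dict  # name -> expression


@dataclass
class ProjectionNode:
    children: dict  # name -> ProjectionNode or expression


def is_operator(value):
    """True for objects like {"$op": arg}: a single key starting with '$'."""
    return isinstance(value, dict) and len(value) == 1 and next(iter(value)).startswith("$")


def parse_expression(spec):
    if is_operator(spec):
        (name, arg), = spec.items()
        if name not in EXPRESSION_PARSERS:
            raise ValueError(f"unknown expression operator: {name}")
        return EXPRESSION_PARSERS[name](arg)
    if isinstance(spec, dict):
        return ExpressionObject({k: parse_expression(v) for k, v in spec.items()})
    return spec  # leaf, e.g. the field path "$_id"


# The marker's parser parses its argument and returns that; no wrapper node.
EXPRESSION_PARSERS = {MARKER: parse_expression}


def parse_projection(spec):
    children = {}
    for name, value in spec.items():
        if isinstance(value, dict) and not is_operator(value):
            children[name] = parse_projection(value)  # plain object: sub-projection
        else:
            children[name] = parse_expression(value)  # operator or leaf: expression
    return ProjectionNode(children)


def serialize_expression(expr):
    if isinstance(expr, ExpressionObject):
        return {k: serialize_expression(v) for k, v in expr.fields.items()}
    return expr


def serialize_projection(node):
    out = {}
    for name, child in node.children.items():
        if isinstance(child, ProjectionNode):
            out[name] = serialize_projection(child)
        elif isinstance(child, ExpressionObject):
            # Mark where the projection ends and the expression begins.
            out[name] = {MARKER: serialize_expression(child)}
        else:
            out[name] = child
    return out


original = ProjectionNode({"a": ExpressionObject({"b": "$_id"})})
wire = serialize_projection(original)
print(wire)
print(parse_projection(wire) == original)
```

Because the projection parser in this sketch already treats any single-key "$" object as an expression, the marker needs no special case there, which mirrors the point about $set / $project / $addFields.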