[SERVER-44258] Pipeline::optimize() does not split a $match with an $or with a single child Created: 25/Oct/19  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Nicholas Zolnierz Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 1
Labels: afz, qopt-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Assigned Teams:
Query Optimization
Sprint: Query 2019-12-16, Query 2019-12-30, Query 2020-01-13
Participants:
Linked BF Score: 21

 Description   

As part of SERVER-36723, a second call to optimizePipeline() was removed, since its original purpose was only to absorb a $limit, which is no longer necessary. However, this introduced a more subtle behavior change, because our pipeline optimization code first attempts to re-order/split stages and only then optimizes each individual stage. In the case where a $match has an $or with one child, the second phase will remove the $or altogether. There's a chance that the new $match without the $or can be split and moved ahead of a prior stage, which coincidentally would only happen on the second call to optimizePipeline().
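To make the ordering problem concrete, here is a toy Python model of the two phases. It is only a sketch and not the server's C++ code: the function names (split_and_reorder, optimize_stages, optimize_pipeline), the dict-based stage representation, and the choice of $addFields as the prior stage are all assumptions made for illustration.

```python
def fields_of(pred):
    """Collect the field names referenced by a toy match predicate."""
    out = set()
    for key, val in pred.items():
        if key == "$or":
            for child in val:
                out |= fields_of(child)
        else:
            out.add(key)
    return out


def split_and_reorder(pipeline):
    """Phase 1 (toy): move top-level $match predicates that do not depend on
    a preceding $addFields ahead of that stage."""
    result = []
    for stage in pipeline:
        prev = result[-1] if result else None
        if "$match" in stage and prev is not None and "$addFields" in prev:
            modified = set(prev["$addFields"])
            independent, dependent = {}, {}
            for key, val in stage["$match"].items():
                target = dependent if fields_of({key: val}) & modified else independent
                target[key] = val
            if independent:
                result.insert(len(result) - 1, {"$match": independent})
            if dependent:
                result.append({"$match": dependent})
        else:
            result.append(stage)
    return result


def simplify(pred):
    """Stage-level rewrite (toy): collapse an $or that has a single child."""
    if set(pred) == {"$or"} and len(pred["$or"]) == 1:
        return simplify(pred["$or"][0])
    return pred


def optimize_stages(pipeline):
    """Phase 2 (toy): optimize each stage individually."""
    return [{"$match": simplify(s["$match"])} if "$match" in s else s
            for s in pipeline]


def optimize_pipeline(pipeline):
    """One optimization pass: re-order/split first, then per-stage optimization."""
    return optimize_stages(split_and_reorder(pipeline))


pipeline = [
    {"$addFields": {"b": 1}},
    {"$match": {"$or": [{"a": 1, "b": 2}]}},
]
once = optimize_pipeline(pipeline)   # $or is removed, but nothing was moved
twice = optimize_pipeline(once)      # only now is {a: 1} split off and moved
```

In this model the first pass leaves the $match behind the $addFields, because phase 1 still sees the $or as a single predicate that depends on "b". The split only happens on the second pass, which mirrors the behavior described above.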



 Comments   
Comment by Nicholas Zolnierz [ 23/Jan/20 ]

We should consider closing the linked BFs if they can't happen anymore, since it looks like the build baron team has been incorrectly marking many multiversion failures as dups (e.g. I just re-opened BF-16031). 

Comment by David Storch [ 23/Jan/20 ]

I don't have time to pursue this ticket further, so I'm sending it to the Query Optimization team for triage. Note that the linked build failures can no longer happen due to the quick fix for this issue already merged under this ticket: https://github.com/mongodb/mongo/commit/5c4c0d968ad72f05c768d553a49ac73e0ddb41f9.

Comment by Githook User [ 21/Dec/19 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: SERVER-45284 Temporarily workaround SERVER-44258 to stop build failures

This is a quick and dirty fix to stop some tests from failing while we
develop a more robust fix.
Branch: master
https://github.com/mongodb/mongo/commit/5c4c0d968ad72f05c768d553a49ac73e0ddb41f9

Comment by Ted Tuckman [ 05/Dec/19 ]

Putting back on the backlog as I am out tomorrow and we would like this fixed. This hasn't been worked on in a few weeks, but the last BF Friday I was in we could only come up with the naive solution: Add an extra unconditional call to optimizePipeline in the aggregation case, but this really just undoes the work that Dave did in his patch.

Comment by Ted Tuckman [ 08/Nov/19 ]

Current state of the ticket: The problem with adding another call to optimize is that it breaks match in $lookup. $lookup removes stages if there are no variables here. If we optimize before this, it removes the match stage (and the rest of the pipeline in some tests). However, we need to have optimized before the match stage in order to split as expected.

After discussion with James today, we decided to investigate whether we could change/modify the dep tracking to work around this. This is still a WIP.

Comment by David Storch [ 29/Oct/19 ]

Would it be reasonable to fix this by having the $match splitting code call optimize() on any DocumentSourceMatch it finds before attempting the swap? I don't think we should swap the order of doing pipeline-level optimizations and stage-level optimizations, since they were designed to happen in a specific order. I'd also hesitate to make the $match splitting code aware of the single-child $or case, since that is the concern of MatchExpression::optimize().
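A rough sketch of that suggestion, again as a toy Python model rather than the server's C++ code: the splitting step optimizes each $match it encounters before deciding whether it can be split, so it never needs to know about the single-child $or case itself. All names here (optimize_match, reorder_with_preoptimize) and the dict-based stages are hypothetical.

```python
def referenced_fields(pred):
    """Collect the field names referenced by a toy match predicate."""
    out = set()
    for key, val in pred.items():
        if key == "$or":
            for child in val:
                out |= referenced_fields(child)
        else:
            out.add(key)
    return out


def optimize_match(pred):
    """Stand-in for the stage-level rewrite: collapse a single-child $or."""
    if set(pred) == {"$or"} and len(pred["$or"]) == 1:
        return optimize_match(pred["$or"][0])
    return pred


def reorder_with_preoptimize(pipeline):
    """Toy splitting pass that optimizes each $match before attempting the swap."""
    result = []
    for stage in pipeline:
        if "$match" not in stage:
            result.append(stage)
            continue
        # Optimize first, so the split logic sees the simplified predicate.
        pred = optimize_match(stage["$match"])
        prev = result[-1] if result else None
        if prev is None or "$addFields" not in prev:
            result.append({"$match": pred})
            continue
        modified = set(prev["$addFields"])
        independent = {k: v for k, v in pred.items()
                       if not (referenced_fields({k: v}) & modified)}
        dependent = {k: v for k, v in pred.items() if k not in independent}
        if independent:
            result.insert(len(result) - 1, {"$match": independent})
        if dependent:
            result.append({"$match": dependent})
    return result


pipeline = [
    {"$addFields": {"b": 1}},
    {"$match": {"$or": [{"a": 1, "b": 2}]}},
]
single_pass = reorder_with_preoptimize(pipeline)
```

In this model a single pass is enough to move {a: 1} ahead of the $addFields, and the knowledge about single-child $or stays inside optimize_match. It does not model the $lookup interaction mentioned in the later comments.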

Generated at Thu Feb 08 05:05:28 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.