[SERVER-60373] Duplicate predicates in query plan for time-series collection Created: 01/Oct/21 Updated: 27/Oct/23 Resolved: 11/Apr/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Rui Liu | Assignee: | Naama Bareket |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | greenerbuild, quick-tech-debt |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Query Integration |
| Operating System: | ALL |
| Steps To Reproduce: | |
| Sprint: | QO 2021-11-15, QO 2021-11-29, QO 2021-12-13, QO 2021-12-27, QO 2022-01-10, QO 2022-01-24, QO 2022-02-07, QO 2022-02-21, QO 2022-03-07, QO 2022-03-21, QO 2022-04-04, QO 2022-04-18, QO 2022-05-02, QO 2022-05-16, QO 2022-05-30, QO 2022-06-13, QO 2022-06-27, QO 2022-07-11, QO 2022-07-25, QO 2022-08-08, QO 2022-08-22, QO 2022-09-05, QO 2022-09-19, QO 2022-10-03, QE 2022-10-17 |
| Participants: | |
| Description |
|
Querying a time-series collection produces a confusing query plan in which each $match predicate appears twice.
|
| Comments |
| Comment by Githook User [ 14/Apr/23 ] |
|
Author: {'name': 'Naama Bareket', 'email': 'naama.bareket@mongodb.com', 'username': 'naama-bareket'} Message:
| Comment by Arun Banala [ 11/Apr/23 ] |
|
It looks like rui.liu@mongodb.com's change to serialize _eventFilter inside the $_internalUnpackBucket as part of
| Comment by Matt Boros [ 07/Oct/22 ] |
|
Unassigning, as there seem to be a few more failing tests and possibly other changes needed to make this work. Here's the latest branch: SERVER-60373. Judging by the latest patch, the JS test that was written needs to be changed.
| Comment by Matt Boros [ 22/Nov/21 ] |
|
Unfortunately I didn't have the chance to complete this ticket before my rotation ended. Here is the branch with the progress I made. Deduplication was implemented in MatchExpression::sortTree where the predicates are sorted. Performance benchmarks seemed to indicate this wouldn't create a regression. It looks like some tests need to be edited to account for the deduplication.
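The idea of deduplicating while the predicates are being sorted can be sketched roughly as below. This is a Python illustration only, not the server's C++ code: the dict-based predicate representation and the helper names `canonical_key` and `sort_and_dedup` are hypothetical, and predicate values are assumed to be JSON-serializable.

```python
import json

def canonical_key(pred):
    """Stable string form of a predicate, used for ordering and equality."""
    return json.dumps(pred, sort_keys=True)

def sort_and_dedup(node):
    """Sort the children of $and/$or nodes and drop exact duplicates."""
    if not isinstance(node, dict):
        return node
    out = {}
    for op, val in node.items():
        if op in ("$and", "$or") and isinstance(val, list):
            # Normalize children first so nested duplicates compare equal.
            children = sorted((sort_and_dedup(c) for c in val), key=canonical_key)
            deduped = []
            for child in children:
                # After sorting, equal predicates are adjacent,
                # so comparing with the previous one is enough.
                if not deduped or canonical_key(deduped[-1]) != canonical_key(child):
                    deduped.append(child)
            out[op] = deduped
        else:
            out[op] = val
    return out

# Hypothetical filter with each predicate listed twice.
plan_filter = {"$and": [{"a": {"$gt": 5}}, {"b": 1}, {"a": {"$gt": 5}}, {"b": 1}]}
result = sort_and_dedup(plan_filter)
```

Because the sort already places equal predicates next to each other, the extra pass is a single linear scan per node, which is consistent with the benchmarks not showing a regression.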
| Comment by Arun Banala [ 21/Oct/21 ] |
|
That could be a reasonable solution. When merging $match stages with an $and with two (or more) children here, we could check if there are any duplicate predicates.
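A minimal sketch of that check, assuming filters are plain Python dicts and that "duplicate" means structurally equal. The function name `merge_match_stages` is hypothetical and does not correspond to an actual server function.

```python
def merge_match_stages(first, second):
    """Combine two $match filters into one $and, keeping each predicate once."""
    def children(f):
        if not f:
            return []  # an empty filter contributes no predicates
        # Flatten a top-level $and so its children are compared individually.
        if list(f.keys()) == ["$and"]:
            return list(f["$and"])
        return [f]

    merged = []
    for pred in children(first) + children(second):
        if pred not in merged:  # structural equality on dicts
            merged.append(pred)

    if not merged:
        return {}
    # A single surviving predicate does not need an $and wrapper.
    return merged[0] if len(merged) == 1 else {"$and": merged}

# Hypothetical example: the second stage repeats a predicate from the first.
combined = merge_match_stages(
    {"a": {"$gt": 5}},
    {"$and": [{"a": {"$gt": 5}}, {"b": 1}]},
)
```

The membership test here is quadratic in the number of predicates, which is acceptable for a sketch; a real implementation would more likely sort or hash the predicates.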
| Comment by David Percy [ 20/Oct/21 ] |
|
Thanks! If it does turn out to be too expensive to combine redundant predicates in CanonicalQuery, I wonder if it would make sense to do it when combining $match stages.
| Comment by Arun Banala [ 20/Oct/21 ] |
|
david.percy There are two different issues here.
| Comment by David Percy [ 19/Oct/21 ] |
|
Seems related to
| Comment by David Percy [ 19/Oct/21 ] |
|
Should we do the same thing on a non-timeseries collection?
|