[SERVER-40117] "$exit" aggregation stage (with $cond operator support) Created: 14/Mar/19 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Trivial - P5 |
| Reporter: | Jonah Werre | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Assigned Teams: |
Query Optimization
|
| Participants: |
| Description |
|
It would be nice if we could exit out of an Aggregate Pipeline. For example:
It would also be nice for debugging longer pipelines:
Thanks for your consideration. |
| Comments |
| Comment by Jonah Werre [ 02/May/19 ] |
|
I think it should "$exit" the pipeline and return a value "up to now". |
| Comment by Asya Kamsky [ 02/May/19 ] |
|
Is the intent to exit aggregation with no output or with "up to now" output or something else? |
| Comment by Jonah Werre [ 18/Mar/19 ] |
|
Great, thanks Eric. Looking forward to seeing how it goes. |
| Comment by Eric Sedor [ 18/Mar/19 ] |
|
jonah@surveyplanet.com, we're assigning this ticket to the appropriate team to be evaluated against our currently planned work. Updates will be posted on this ticket as they happen. |
| Comment by Andy Schwerin [ 14/Mar/19 ] |
|
Interesting. You're essentially looking to have a branch or data-steering in the pipeline. Coincidentally, one side of your branch does no special work, but I don't think that's inherent. I wonder if your particular example might be implementable by writing two separate lookup stages, one that only matches if the question type is multiple choice and one that only matches if the question type is open-ended? Or by pushing the lookups into the facet stage, which already represents a kind of steering/branching. Even if such a workaround is available, the general problem of describing a pipeline with data-controlled steering (making it more of a directed graph instead of the tree it is today) is super interesting. |
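To make the $facet suggestion above concrete, here is a minimal sketch of a branching pipeline written as a Python list, in the form PyMongo's `collection.aggregate()` accepts. The reporter's original example is not preserved in this ticket, so the document shape is an assumption: the field names `question_type`, `question_id`, `value`, and `created` are hypothetical, and the lookups mentioned above are left out. Only existing stages are used ($facet, $match, $group, $sort, $limit).

```python
# Hypothetical document shape (NOT from the ticket):
#   {"question_id": ..., "question_type": "multiple_choice" | "open_ended",
#    "value": ..., "created": <date>}
pipeline = [
    {
        "$facet": {
            # Branch 1: tally every multiple-choice answer per question and choice.
            "multiple_choice": [
                {"$match": {"question_type": "multiple_choice"}},
                {
                    "$group": {
                        "_id": {"question": "$question_id", "choice": "$value"},
                        "count": {"$sum": 1},
                    }
                },
            ],
            # Branch 2: keep only the 20 most recent open-ended answers.
            "open_ended": [
                {"$match": {"question_type": "open_ended"}},
                {"$sort": {"created": -1}},
                {"$limit": 20},
            ],
        }
    }
]
```

With $facet, both branches run over the same input and the stage emits a single document with one array per branch, so this approximates steering rather than a true early exit.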
| Comment by Jonah Werre [ 14/Mar/19 ] |
|
This is pseudocode, but the gist of it is that I'm tabulating answers for multiple-choice and open-ended questions. For multiple-choice questions I need all the answers so I can add them up and produce a summary of responses. But you can't add up open-ended questions, so I'm just showing the most recent 20. For example:
The result should look something like this:
|
| Comment by Eric Sedor [ 14/Mar/19 ] |
|
Thanks jonah@surveyplanet.com. We understand the potential value of $exit without $cond. But if you could further explain what you are looking to do with $exit+$cond and its potential benefit for your system, it will help us reason about this request. |
| Comment by Jonah Werre [ 14/Mar/19 ] |
|
Thanks for the speedy reply, Eric. To clarify, I was thinking $exit would work differently than $match, since the later stages of the pipeline would be ignored if $exit evaluates to true for any document. In the first example, $match would pass any books that were not 'science fiction' or 'fantasy' and sort them, while $exit would skip the $sort altogether if it found any documents of those genres. This would essentially create a conditional $sort and return either 20 documents unsorted or 20 documents sorted by "created" date.
On the debugging front, I have used Compass in the past, but it can be a little cumbersome keeping pipelines from Compass in sync with production code pipelines, so I prefer to stay in my editor/IDE. It's just a personal preference that I'm sure a lot of developers share. As it stands now, I often have to comment out big chunks of a pipeline to see the output. It would be nice to be able to "tap" the pipeline at any stage to see the result by adding an $exit stage. As an added benefit, any $exit could be changed to false and left there for future debugging. |
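The semantics described in this comment can be modeled in plain Python. This is only a sketch of one possible reading: $exit is a proposal and does not exist in MongoDB, and `run_pipeline`, `limit`, `sort_by`, and the sample book documents are hypothetical names and data invented for illustration. An exit stage takes either a per-document predicate (exit if it holds for any document) or a constant boolean (the debugging "tap").

```python
def limit(n):
    """Stage factory: keep the first n documents."""
    return lambda docs: docs[:n]


def sort_by(field):
    """Stage factory: sort documents ascending by the given field."""
    return lambda docs: sorted(docs, key=lambda d: d[field])


def run_pipeline(docs, stages):
    """Apply stages in order; an ("exit", condition) stage may stop early.

    condition is a bool (unconditional tap) or a predicate over one document.
    When it fires, the remaining stages are skipped and the documents
    produced "up to now" are returned.
    """
    current = list(docs)
    for stage in stages:
        if isinstance(stage, tuple) and stage[0] == "exit":
            condition = stage[1]
            if isinstance(condition, bool):
                fired = condition
            else:
                fired = any(condition(d) for d in current)
            if fired:
                return current
            continue
        current = stage(current)
    return current


def is_genre_fiction(doc):
    return doc["genre"] in {"science fiction", "fantasy"}


books = [
    {"title": "B", "genre": "history", "created": 2},
    {"title": "A", "genre": "biography", "created": 1},
]
stages = [limit(20), ("exit", is_genre_fiction), sort_by("created")]

# No matching genre: the exit does not fire, so the sort runs.
sorted_result = run_pipeline(books, stages)

# One fantasy book present: the exit fires and the sort is skipped.
with_fantasy = books + [{"title": "C", "genre": "fantasy", "created": 0}]
unsorted_result = run_pipeline(with_fantasy, stages)

# Debug tap: an unconditional exit returns the intermediate result;
# switching it to False leaves the stage in place as a no-op.
tapped = run_pipeline(books, [limit(20), ("exit", True), sort_by("created")])
untapped = run_pipeline(books, [limit(20), ("exit", False), sort_by("created")])
```

This model evaluates the condition over the whole intermediate result, which sidesteps the streaming question raised elsewhere in this ticket; a real stage would need to define what happens to documents that have already flowed downstream.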
| Comment by Eric Sedor [ 14/Mar/19 ] |
|
P.S. For now, MongoDB Compass includes a pipeline builder feature that can be helpful for debugging large pipelines. |
| Comment by Eric Sedor [ 14/Mar/19 ] |
|
Thanks for your request jonah@surveyplanet.com. Since an aggregation pipeline streams documents and the example involves an $exit+$cond evaluating documents, can you clarify what your desired behavior would be when a document reaches an $exit stage that evaluates to true? Specifically, do you envision any documents that successfully passed an $exit:false condition before that point to still be passed through the later pipeline stages and returned as results? If so, then is it accurate to say that this would be similar to a $match stage that automatically excluded all documents after a single document failed to be matched? Any additional information about the use-cases this feature would service would be helpful! |
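The streaming interpretation asked about here, a $match that excludes every document once a single document fails, behaves like `itertools.takewhile`. The sketch below is a plain-Python model of that reading only; the function name `exit_as_streaming_match` and the sample documents are hypothetical.

```python
from itertools import takewhile


def exit_as_streaming_match(docs, should_exit):
    """Pass documents downstream until one triggers the exit condition.

    The triggering document and everything after it are dropped, while
    documents that passed earlier are still returned.
    """
    return list(takewhile(lambda d: not should_exit(d), docs))


books = [
    {"title": "A", "genre": "history"},
    {"title": "B", "genre": "fantasy"},
    {"title": "C", "genre": "biography"},
]

result = exit_as_streaming_match(
    books, lambda d: d["genre"] in {"science fiction", "fantasy"}
)
print(result)  # [{'title': 'A', 'genre': 'history'}]
```

Under this reading the output depends on the order in which documents arrive at the stage, so results would only be deterministic after an explicit sort.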