[SERVER-40759] New Agg Metadata Source to Generate A Single Empty Document
Created: 22/Apr/19 | Updated: 14/Dec/22
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 4.0.0 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Patrick Meredith | Assignee: | Backlog - Query Execution |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Assigned Teams: | Query Execution |
| Participants: |
| Description |
|
There are quite a few situations in which {$limit: 0} would be useful. For instance, in the BI-Connector we currently use $collStats to inject a single document when we need to push down a subquery that must produce exactly one projected result. However, $collStats has some issues, and we think $facet might be a better choice. But in this case we would prefer {$limit: 0} to {$limit: 1}, since we literally don't care about the result. For example:
We would like to push down the subquery select "hello" as:
At some future point this could even be optimized to not look at any documents (currently, on a collection of 100K docs, {$limit: 1} looks at 853 of them in a $facet on server 4.0), but for now, simply changing the error condition from non-positive to negative would be an improvement. |
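
As a rough illustration of the $facet workaround described above (a sketch only, not the BI-Connector's actual generated pipeline): $facet emits a single output document whatever its sub-pipeline returns, so a following $project can supply the constant. The helper name `selectConstantPipeline` and the facet field name `ignored` are hypothetical.

```javascript
// Hypothetical helper: build a pipeline that yields exactly one document
// carrying a constant, roughly what `select "hello"` needs.
function selectConstantPipeline(fieldName, value, limit = 1) {
  return [
    // $facet produces one output document; the sub-pipeline result is ignored.
    // Server 4.0 rejects a non-positive limit, so 1 is the smallest accepted
    // value; this ticket asks for 0 to be accepted as well.
    { $facet: { ignored: [{ $limit: limit }] } },
    // $literal keeps the value from being parsed as a field path or a flag.
    { $project: { _id: 0, [fieldName]: { $literal: value } } },
  ];
}

const pipeline = selectConstantPipeline("hello", "hello");
// In the mongo shell this would be run as: db.someCollection.aggregate(pipeline)
```

If {$limit: 0} were accepted, the same helper could be called with `limit` set to 0 and nothing else would need to change.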
| Comments |
| Comment by Kevin Pulo [ 13/May/19 ] |
Yes. Like $collStats and friends, this would only really make sense at the start of a pipeline, and it would be fine if that was enforced (to prevent pipeline coding errors from accidentally discarding the results of earlier stages), though this might affect usage in some situations (e.g. views). The direct motivation is that currently you can get a single empty document from somewhere else, but all the ways of doing so are hacks. The possibilities are:
The indirect motivation is sub-queries. In the case of the BIC, this is a direct need (I believe — I'm no expert on SQL subqueries). The $lookup stage already serves the purpose just fine, as long as a suitable input document can be crafted for it. In my case, I want to issue 1000 queries to the server, each of the form:
without having to do 1000 round-trips to the server (which is my only other option). While a "bulk query" feature could be implemented to support this, again the situation is such that $lookup does what I need, as long as I can craft the input to it. Here's a mongo shell implementation of this idea. Basically, it's possible, but clumsy, to do this:
and so the ask here is to be able to instead do:
which is clearly better in a variety of ways. I'm deliberately not asking for more substantial/advanced features (eg. bulk-query or "actual" sub-query) because those would be a lot of work, whereas once the synthetic document has been obtained, the existing tools available ($lookup, $project, $addFields, $replaceRoot, etc) are sufficient to achieve the desired goal without too much hoop-jumping. For reference, I expect the proposed $emptyDocument stage would have a trivial core implementation pretty close to this:
which is why I'm trying to minimise the amount of supporting code (e.g. to parse and use a given literal document), because the idea is just some simple and easy sugar to replace the existing clumsy technique. Other possible solutions could be things like the below (with hopefully obvious semantics), all of which I would expect to be more work for not much benefit.
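
To make the $lookup idea from this comment concrete, here is a minimal sketch of how many independent queries might be packed into one aggregation. It is not the shell implementation referred to above. It assumes the uncorrelated `pipeline` form of $lookup (server 3.6+); the helper `bulkQueryPipeline`, the `q0`, `q1`, ... output field names, and the `__no_such_field` name are hypothetical, and `$emptyDocument` is the stage proposed in this ticket, not something the server provides.

```javascript
// Existing-stage workaround: take any one document, then include only a
// field that is assumed not to exist, which should leave an empty document.
const seedWithExistingStages = [
  { $limit: 1 },
  { $project: { _id: 0, __no_such_field: 1 } },
];

// The stage proposed in this ticket; hypothetical, not available on any server.
const seedWithProposedStage = [{ $emptyDocument: {} }];

// Hypothetical helper: append one uncorrelated $lookup per query, so each
// query's results land in their own array field of the single seed document.
function bulkQueryPipeline(seedStages, from, queries) {
  const lookups = queries.map((query, i) => ({
    $lookup: { from, pipeline: [{ $match: query }], as: `q${i}` },
  }));
  return [...seedStages, ...lookups];
}

const clumsy = bulkQueryPipeline(seedWithExistingStages, "coll", [{ a: 1 }, { a: 2 }]);
const proposed = bulkQueryPipeline(seedWithProposedStage, "coll", [{ a: 1 }, { a: 2 }]);
```

Either pipeline returns all results inside one document, so the combined result set is still subject to the BSON document size limit.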
|
| Comment by Asya Kamsky [ 02/May/19 ] |
|
You're saying you want a single document synthesized from a stage regardless of how many documents are passed into it? If that's the case, why can't you use $collStats (you say "issues" but you don't say what they are)? Rather than requesting a specific implementation, can you please describe your use case (completely) so we can figure out the best way to address it long term? |
| Comment by Kevin Pulo [ 23/Apr/19 ] |
|
I've also had a similar need in the past. In that case, I used a collection that I knew would have at least one document, and then started my pipeline with
to get a single empty document, which I then populated as necessary for the sub-query. Rather than mess around like this, or with similar $collStats or $facet hacks, would it be better to have a simple "$emptyDocument" DocumentSource stage that just outputs a single empty document? You could then easily $project in whatever fields you like, as normal. And if you want multiple created documents, you could $project an array and $unwind it. |
| Comment by Kelsey Schubert [ 22/Apr/19 ] |
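
The "$project an array and $unwind it" suggestion could look roughly like the sketch below. The helper name `fanOutStages` is hypothetical, and `$emptyDocument` is the stage proposed here, not an existing one; today the seed would have to come from one of the workarounds discussed in this ticket.

```javascript
// Hypothetical helper: turn one seed document into one document per value.
function fanOutStages(fieldName, values) {
  return [
    // Attach the whole array to the single seed document...
    { $project: { _id: 0, [fieldName]: { $literal: values } } },
    // ...then emit one document per array element.
    { $unwind: `$${fieldName}` },
  ];
}

// Proposed (non-existent) seed stage followed by the fan-out.
const stages = [{ $emptyDocument: {} }, ...fanOutStages("n", [1, 2, 3])];
```

With the default $unwind options an empty array produces no documents at all, so `values` needs at least one element for anything to come out.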
|
Opposite of |