[DOCS-13989] Investigate changes in SERVER-49024: Disallow $lookup uncorrelated pipeline caching for stages containing $sample/$rand/$sampleRate Created: 17/Nov/20 Updated: 13/Nov/23 Resolved: 16/Aug/21 |
|
| Status: | Closed |
| Project: | Documentation |
| Component/s: | manual, Server |
| Affects Version/s: | None |
| Fix Version/s: | 4.9.0, Server_Docs_20231030, Server_Docs_20231106, Server_Docs_20231105, Server_Docs_20231113 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Backlog - Core Eng Program Management Team | Assignee: | Jason Price |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Participants: | |||||||||
| Days since reply: | 2 years, 25 weeks, 2 days ago | ||||||||
| Epic Link: | DOCSP-15042 | ||||||||
| Story Points: | 3 | ||||||||
| Description |
Downstream Change Summary
Previously, using $sample in an uncorrelated subquery had inconsistent behavior: the $sample would either be cached or re-run depending on the size of the output. This would also affect $rand. Now, $sample and $rand don't count as "uncorrelated", so $lookup always re-runs them.
Description of Linked Ticket
The $sample stage returns a different sample every time it runs. $lookup sometimes re-runs the inner pipeline per outer document, and sometimes runs it only once. This makes the behavior of $sample inside $lookup hard to predict. For example, this query runs the sub-pipeline only once, resulting in the same sample chosen every time:
On the other hand, this query re-runs the sub-pipeline, choosing a different sample per outer document:
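The difference between the two behaviors can be sketched with a small simulation. This is hypothetical Python, not MongoDB server code and not the query from the linked ticket: `run_lookup`, `cache_inner`, and the sample data are names invented for illustration, and `random.Random.sample` stands in for a $sample stage.

```python
import random


def run_lookup(outer_docs, inner_collection, sample_size, cache_inner, rng):
    """Attach a random sample of inner_collection to each outer document.

    cache_inner=True mimics the sub-pipeline output being cached and reused;
    cache_inner=False mimics the sub-pipeline being re-run per outer document.
    """
    cached = None
    results = []
    for doc in outer_docs:
        if cache_inner:
            # Cached behavior: the sample is drawn once and then reused.
            if cached is None:
                cached = rng.sample(inner_collection, sample_size)
            joined = cached
        else:
            # Re-run behavior: a fresh sample is drawn for every outer document.
            joined = rng.sample(inner_collection, sample_size)
        results.append({**doc, "joined": list(joined)})
    return results


rng = random.Random(0)  # fixed seed so repeated runs of this sketch agree
outer = [{"_id": i} for i in range(5)]
inner = list(range(1000))

cached_run = run_lookup(outer, inner, 3, cache_inner=True, rng=rng)
rerun = run_lookup(outer, inner, 3, cache_inner=False, rng=rng)

# Number of distinct samples seen across the outer documents in each mode.
distinct_cached = len({tuple(d["joined"]) for d in cached_run})
distinct_rerun = len({tuple(d["joined"]) for d in rerun})
print(distinct_cached)  # 1: every outer document shares the same sample
print(distinct_rerun)   # more than one distinct sample is expected here
```

In the cached mode every outer document carries the identical sample, which is the "same sample chosen every time" case; in the re-run mode the samples generally differ per outer document.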
Since we consider DocumentSourceSequentialDocumentCache to be an optimization, there could be other exceptions to this rule. For example, if you add a dummy correlation hoping to force the inner pipeline to re-run, it can get optimized out. This ticket makes any $sample stage, and any stage containing a $rand or $sampleRate expression, ineligible for uncorrelated pipeline caching. |
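The eligibility rule described above can be approximated in a few lines. This is a minimal sketch under stated assumptions, not the server's implementation: the function names are hypothetical, pipelines are plain Python dicts and lists as they would be passed to a driver, and the check is purely syntactic (it looks for the operator names as keys at any depth, so it would also flag a field literally named `$rand`). It assumes the sub-pipeline has already been judged uncorrelated by other means.

```python
# Operators named in the ticket as making a sub-pipeline ineligible for caching.
NON_DETERMINISTIC_OPERATORS = {"$sample", "$rand", "$sampleRate"}


def contains_random_operator(node):
    """Return True if any key at any nesting depth is one of the listed operators."""
    if isinstance(node, dict):
        return any(
            key in NON_DETERMINISTIC_OPERATORS or contains_random_operator(value)
            for key, value in node.items()
        )
    if isinstance(node, (list, tuple)):
        return any(contains_random_operator(item) for item in node)
    return False


def eligible_for_uncorrelated_caching(pipeline):
    """Hypothetical helper: an (assumed uncorrelated) pipeline is cacheable
    only if it contains no $sample, $rand, or $sampleRate."""
    return not contains_random_operator(pipeline)


plain = [{"$match": {"status": "A"}}]
with_sample = [{"$sample": {"size": 1}}]
with_rand = [{"$match": {"$expr": {"$lt": [{"$rand": {}}, 0.5]}}}]
with_sample_rate = [{"$match": {"$sampleRate": 0.33}}]

for name, pipeline in [
    ("plain", plain),
    ("with_sample", with_sample),
    ("with_rand", with_rand),
    ("with_sample_rate", with_sample_rate),
]:
    print(name, eligible_for_uncorrelated_caching(pipeline))
```

Only the `plain` pipeline is reported as eligible; the other three each contain one of the operators, so under the new rule $lookup would re-run them for every outer document.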
| Comments |
| Comment by Githook User [ 16/Aug/21 ] |
|
Author: {'name': 'jason-price-mongodb', 'email': 'jshfjghsdfgjsdjh@aolsdjfhkjsdhfkjsdf.com'}
Message: |