[SERVER-24710] Optimize $sample+$project Created: 22/Jun/16 Updated: 06/Dec/22 Resolved: 22/Jun/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Ross Lawley | Assignee: | Backlog - Query Team (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Assigned Teams: | Query |
| Participants: | |
| Description |
|
When running a pipeline that does a $sample followed by a $project, I noticed it was much slower than doing the $project first and then the $sample. Could this be optimized, along the lines of the optimizations described at: https://docs.mongodb.com/manual/core/aggregation-pipeline-optimization
For example, with the MovieLens dataset (~1 million documents):
Pipeline: $sample + $project _id: 76120 ms |
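For reference, the two stage orderings compared in the description could be sketched as below. This is a minimal illustration, not the reporter's actual script: the sample size is not stated in the ticket, so `SAMPLE_SIZE` is an assumed placeholder, and `time_pipeline` is a hypothetical helper that works with any object exposing an `aggregate(pipeline)` method (for example a PyMongo collection).

```python
import time

# Assumption: the ticket does not state the sample size; 100 is a placeholder.
SAMPLE_SIZE = 100


def sample_then_project(size=SAMPLE_SIZE):
    """Ordering reported as slow: $sample first, then $project on _id."""
    return [{"$sample": {"size": size}}, {"$project": {"_id": 1}}]


def project_then_sample(size=SAMPLE_SIZE):
    """Ordering reported as faster: $project on _id first, then $sample."""
    return [{"$project": {"_id": 1}}, {"$sample": {"size": size}}]


def time_pipeline(collection, pipeline):
    """Run the pipeline, drain the cursor, and return (elapsed_ms, doc_count).

    `collection` is anything with an aggregate(pipeline) method that
    returns an iterable of documents.
    """
    start = time.perf_counter()
    count = sum(1 for _ in collection.aggregate(pipeline))
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, count
```

With PyMongo, `collection` might be obtained as `MongoClient()["movielens"]["movies"]`; the database and collection names here are hypothetical, since the ticket does not name them.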
| Comments |
| Comment by Ross Lawley [ 22/Jun/16 ] |
|
Duplicate of
Tested on 3.2.7 with data loaded via mongorestore. $sample + $project is much slower than $project + $sample until the mongod is restarted; after the restart it is much faster. |
| Comment by Ross Lawley [ 22/Jun/16 ] |
|
Looks to be a duplicate of |