[SERVER-24981] $project-$limit optimization has bad repercussion on pipeline splitting Created: 11/Jul/16  Updated: 17/Apr/18  Resolved: 07/Dec/17

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: 3.7.1

Type: Improvement Priority: Major - P3
Reporter: Antoine Hom Assignee: Janna Golden
Resolution: Done Votes: 0
Labels: performance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: explain_plan.log, explain_plan_redact.log
Issue Links:
Backports
Documented
is documented by DOCS-11102 Docs for SERVER-24981: $project-$limi... Closed
Related
is related to SERVER-24978 Second batches in aggregation framewo... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v3.6
Sprint: Query 2017-11-13, Query 2017-12-04, Query 2017-12-18
Participants:

 Description   

The new $project-$limit optimization in 3.2 can cause the pipeline to be split much earlier than before, because the pipeline is now split at the $limit stage.
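
For illustration, a minimal sketch of the behavior (collection and field names are hypothetical, not taken from the attached plans):

    // A pipeline of this shape is affected:
    db.orders.aggregate([
        { $match: { status: "active" } },
        { $project: { _id: 0, customer: 1, total: 1 } },
        { $limit: 10 }
    ])

    // The optimizer swaps the $limit ahead of the $project, effectively:
    //   [ { $match: ... }, { $limit: 10 }, { $project: ... } ]
    // In a sharded cluster the pipeline is then split at the $limit, so the
    // shards return full documents and the $project runs on the merger.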

I'm attaching explain plans for two queries: one that uses the optimization and one that doesn't, because I added a $redact: $$KEEP stage just before the $limit.
In the case of this query, many more fields are sent to the merger part because of the splitting, which triggers very bad behavior with second batches of aggregation queries (described in SERVER-24978).
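
For reference, a sketch of the workaround used in the attached plans: a no-op $redact stage placed just before the $limit blocks the $project/$limit swap, keeping the $project on the shards.

    // No-op stage: $$KEEP retains every document unchanged.
    { $redact: "$$KEEP" }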

I think it would be good to take pipeline splitting into consideration when applying these optimizations (in addition, there is no $sort stage here that would benefit from having the $limit moved up).

Cheers,
Antoine



 Comments   
Comment by Githook User [ 07/Dec/17 ]

Author: Janna Golden (jannaerin) <golden.janna@gmail.com>

Message: SERVER-24981 Rewrite $limit optimization
Branch: master
https://github.com/mongodb/mongo/commit/bbebcbfde994ec14b9fabfe17779cfb5adcda211

Comment by David Storch [ 12/Oct/17 ]

tess.avitabile, this sounds reasonable to me. I think for now we can move this out to 3.7 Desired, but this could be a good thing for janna.golden to work on after she has a little ramp-up time on the query team.

Comment by Tess Avitabile (Inactive) [ 11/Oct/17 ]

I like charlie.swanson's suggestion to have $sort look ahead in the pipeline past stages that preserve the number of documents for a $limit to coalesce with (where by ahead, I mean [{$sort: ...}, ..., {$limit: ...}]). There is no benefit to swapping $limit before $project except when it can find a $sort to coalesce with. And there is no harm in swapping $limit before $project when there is a $sort earlier in the pipeline, because the pipeline will be split at $sort, so the $project would not be performed on the shards anyway. asya's suggestion to duplicate the $limit when there is an intervening stage that increases the number of documents seems like a good extension.
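
As a hypothetical illustration of the duplication idea (the exact safety conditions below are my assumption, not spelled out in the comment), consider an $unwind that emits at least one document per input:

    [
        { $sort: { score: -1 } },
        { $limit: 10 },   // duplicated copy; can coalesce with the $sort
        { $unwind: { path: "$tags", preserveNullAndEmptyArrays: true } },
        { $limit: 10 }    // original limit still caps the final result
    ]

    // The duplicated $limit is only safe because this $unwind never decreases
    // the document count: 10 inputs are guaranteed to yield at least 10 outputs.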

I do not think we can say whether it is always better or worse to swap $skip before $project. On a single shard, it is clearly always better. But in a sharded cluster, it depends on the cost of the $project versus how much the $project reduces the document size. Since we cannot determine whether the swap is an improvement, and there are no reported issues about the current optimization, I recommend we leave it as is.

Comment by Charlie Swanson [ 09/Sep/16 ]

I have one idea of how to fix this:
Our current optimization puts $limit in front of $project in the hope that it will later coalesce with a $sort. Instead, we could make $sort responsible for looking ahead in the pipeline to find a $limit. The $sort could keep looking past stages like $project, $addFields, etc. that do not change the number of documents in the pipeline. This keeps as much work as possible on the shards in parallel, while still allowing $sort to find a $limit.
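
A sketch of the look-ahead (pipeline shape hypothetical):

    // $sort scans past count-preserving stages, finds the $limit, and
    // absorbs it as a top-5 sort; the stage order is never changed:
    [
        { $sort: { ts: -1 } },
        { $project: { ts: 1, msg: 1 } },
        { $limit: 5 }
    ]
    // In a pipeline with no $sort at all, the $limit is simply left in
    // place, so the $project keeps running on the shards.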

The $skip optimization might suffer from a problem similar to the one described here, and I'm not sure if/how we want to address it. The $skip/$project swap was meant to reduce the number of documents transformed by the $project. I'm tempted to think that this is still a worthwhile optimization. If so, we'd want to add some special logic after splitting the pipeline to check whether the next stage(s) in the merge part are a $project (or, again, something like $addFields). If there is at least one such stage, we can move it or them back to the parallel shard part.
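
A sketch of that post-split fix-up (stage contents hypothetical):

    // After the $skip/$project swap, a split pipeline can look like:
    //   shard part : [ { $match: { region: "EU" } } ]
    //   merge part : [ { $skip: 100 }, { $project: { _id: 0, summary: 1 } } ]
    // $project preserves the number and order of documents, so it can be
    // moved back to run on the shards in parallel:
    //   shard part : [ { $match: { region: "EU" } }, { $project: { _id: 0, summary: 1 } } ]
    //   merge part : [ { $skip: 100 } ]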

If we do that second piece of work, we might not need to do the first, since the same strategy would work for $limit.

Comment by Ramon Fernandez Marina [ 11/Jul/16 ]

Thanks for your reports, antoine.hom@amadeus.com. Both SERVER-24978 and this ticket have been sent to the Query team for consideration. Please continue to watch both tickets for updates.

Regards,
Ramón.

Comment by Antoine Hom [ 11/Jul/16 ]

The query without the $redact timed out after 10+ minutes on our cluster (because of SERVER-24978).
The one with the $redact stage finished in 1 minute.
