[SERVER-62952] Order of $sort and $limit aggregation Created: 25/Jan/22  Updated: 02/Jun/22  Resolved: 31/Jan/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Aniket Jha Assignee: Edwin Zhou
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

Does the order of $skip and $limit matter in pipeline?

I use mongoose aggregation which run a pipeline of operations. With skip and limt appended at end.

projectModel.aggregate(pipeline)
  .sort({ updatedAt: -1 })
  .skip(skip)
  .limit(limit)

Our pipeline simply looks like

$match(userId) > $lookup(html_collection) > $lookup(records_collection) > $sort(updatedAt) > $skip(mongoose) > $limit(mongoose)

What I observed during pagination that $limit is being respected but $skip is happening only at end of pipeline. For example: 

page1: skip = 0, limit = 10
The number of documents as per the .explain() clearing $match stage is 10.

page 2: skip = 10, limit = 10
The number of documents clearing $match stage is 20 (skip + limit), and 20 documents made it to next stage. $lookup is made on 20 documents. Slowing down our pipeline and it is at the last stage when $skip works to discard first 10 documents. Here we wasted work for first 10 documents.

This was causing an issue in our pipeline and it was getting slow for pagination. 

QUESTION: Is it expected behaviour? And  we were using it in an unintented way?

SOLUTION: What we ended up doing was to move $limit followed by $skip.
The $limit = skip + limit, $skip = skip. What we thought that limiting documents as limit = skip + limit will fetch documents and $skip at next stage would reject unneccessary documents, thereby giving $lookup stages only the expected documents.

page 1: skip = 0, limit = 10
$limit = 0 + 10 = 10 followed by $skip = 0

page 2: skip = 10, limit = 10
$limit = 10 + 10 followed by $skip = 10

our pipeline now looks like:

$match(userId) > $sort(updatedAt) > $limit(limit + skip) > $skip (skip) > $lookup(html_collection) > $lookup(records_collection)

Here is collection scehema for your refernce:

 

PROJECT COLLECTION
 
{
 id: ObjectId,
 records: [ObjectId],
 userId: ObjectId
}

 
RECORDS COLLECTION
 
{
 id: ObjectId,
 text: string
}
 
HTML COLLECTION
 
{
  id: ObjectId,
  html: string
}

QUESTION:

  1. The behabiour is intened or something is wrong here.
  2. The solution we came up with is that correct and will it scale, as I think for last page there are too many documents clearing $match stage, but that is also something that mongodb internall did for first case pagination 2 right? 

 

 

 



 Comments   
Comment by Edwin Zhou [ 31/Jan/22 ]

Hi aniketjha898@gmail.com,

Thanks for your report. This behavior appears expected given the order of operations displayed in your query. Using skip and limit last would mean that the $lookup stages will need to first lookup all of the results of the $match stage. You've mentioned this key point:

...[skip and limit] at next stage would reject unneccessary documents, thereby giving $lookup stages only the expected documents.

For issues and questions like this in the future, we'd like to encourage you to start by asking our community for help by posting on the MongoDB Developer Community Forums.

If the discussion there leads you to suspect a bug in the MongoDB server, then we'd want to investigate it as a possible bug here in the SERVER project.

Best,
Edwin

Comment by Aniket Jha [ 28/Jan/22 ]

Hey team

If possible can you verify the issue and pick to see if it is a bug or it is expected behavior as soon as possible. I'm watching this thread and can provide any more details if required. 😊 

Generated at Thu Feb 08 05:56:30 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.