Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-62952

Order of $sort and $limit aggregation

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • ALL

      Does the order of $skip and $limit matter in pipeline?

      I use mongoose aggregation which run a pipeline of operations. With skip and limt appended at end.

      projectModel.aggregate(pipeline)
        .sort({ updatedAt: -1 })
        .skip(skip)
        .limit(limit)

      Our pipeline simply looks like

      $match(userId) > $lookup(html_collection) > $lookup(records_collection) > $sort(updatedAt) > $skip(mongoose) > $limit(mongoose)

      What I observed during pagination that $limit is being respected but $skip is happening only at end of pipeline. For example: 

      page1: skip = 0, limit = 10
      The number of documents as per the .explain() clearing $match stage is 10.

      page 2: skip = 10, limit = 10
      The number of documents clearing $match stage is 20 (skip + limit), and 20 documents made it to next stage. $lookup is made on 20 documents. Slowing down our pipeline and it is at the last stage when $skip works to discard first 10 documents. Here we wasted work for first 10 documents.

      This was causing an issue in our pipeline and it was getting slow for pagination. 

      QUESTION: Is it expected behaviour? And  we were using it in an unintented way?

      SOLUTION: What we ended up doing was to move $limit followed by $skip.
      The $limit = skip + limit, $skip = skip. What we thought that limiting documents as limit = skip + limit will fetch documents and $skip at next stage would reject unneccessary documents, thereby giving $lookup stages only the expected documents.

      page 1: skip = 0, limit = 10
      $limit = 0 + 10 = 10 followed by $skip = 0

      page 2: skip = 10, limit = 10
      $limit = 10 + 10 followed by $skip = 10

      our pipeline now looks like:

      $match(userId) > $sort(updatedAt) > $limit(limit + skip) > $skip (skip) > $lookup(html_collection) > $lookup(records_collection)

      Here is collection scehema for your refernce:

       

      PROJECT COLLECTION
      
      {
       id: ObjectId,
       records: [ObjectId],
       userId: ObjectId
      }
      
      RECORDS COLLECTION
      
      {
       id: ObjectId,
       text: string
      }
      
      HTML COLLECTION
      
      {
        id: ObjectId,
        html: string
      }

      QUESTION:

      1. The behabiour is intened or something is wrong here.
      2. The solution we came up with is that correct and will it scale, as I think for last page there are too many documents clearing $match stage, but that is also something that mongodb internall did for first case pagination 2 right? 

       

       

       

            Assignee:
            edwin.zhou@mongodb.com Edwin Zhou
            Reporter:
            aniketjha898@gmail.com Aniket Jha
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: