Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-86243

Allow SORT stage to exploit partially sorted inputs

    • Type: Icon: New Feature New Feature
    • Resolution: Unresolved
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Query Optimization

      The agg-query-comparison_read_commands workload (categorized as a high-value workload) issues the following query:

       

      find({x: {$gt: 500}, y: {$gt: 500}}, {x: 1, y: 1, _id: 0}).sort({y: 1}) 

      with an index on

       

       

      {x: 1, y: 1}
      

      The plan that we generate looks like

       

       queryPlan: {
              stage: 'SORT',
              planNodeId: 3,
              sortPattern: { y: 1 },
              memLimit: 104857600,
              type: 'default',
              inputStage: {
                stage: 'PROJECTION_COVERED',
                planNodeId: 2,
                transformBy: { _id: false, x: true, y: true },
                inputStage: {
                  stage: 'IXSCAN',
                  planNodeId: 1,
                  keyPattern: { x: 1, y: 1 },
                  indexName: 'x_1_y_1',
                  isMultiKey: false,
                  multiKeyPaths: { x: [], y: [] },
                  isUnique: false,
                  isSparse: false,
                  isPartial: false,
                  indexVersion: 2,
                  direction: 'forward',
                  indexBounds: { x: [ '(500, inf.0]' ], y: [ '(500, inf.0]' ] }
                }
              }
            }, 

      The plan performs a blocking sort on y. This throws away a potentially useful property of the input stream; for each unique value of x, the input is already sorted on y. If each value of x has many values of y, a better plan may be to perform a merge sort of the existing result set of that of the current value of x we are considering in the index scan. But in the worse case where each value of x has only one document, this approach is worse than a blocking sort.

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            ben.shteinfeld@mongodb.com Ben Shteinfeld
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated: