[SERVER-86243] Allow SORT stage to exploit partially sorted inputs Created: 05/Feb/24  Updated: 06/Feb/24

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Ben Shteinfeld Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Optimization
Participants:

 Description   

The agg-query-comparison_read_commands workload (categorized as a high-value workload) issues the following query:

 

find({x: {$gt: 500}, y: {$gt: 500}}, {x: 1, y: 1, _id: 0}).sort({y: 1}) 

with an index on

 

 

{x: 1, y: 1}

The plan that we generate looks like

 

 queryPlan: {
        stage: 'SORT',
        planNodeId: 3,
        sortPattern: { y: 1 },
        memLimit: 104857600,
        type: 'default',
        inputStage: {
          stage: 'PROJECTION_COVERED',
          planNodeId: 2,
          transformBy: { _id: false, x: true, y: true },
          inputStage: {
            stage: 'IXSCAN',
            planNodeId: 1,
            keyPattern: { x: 1, y: 1 },
            indexName: 'x_1_y_1',
            isMultiKey: false,
            multiKeyPaths: { x: [], y: [] },
            isUnique: false,
            isSparse: false,
            isPartial: false,
            indexVersion: 2,
            direction: 'forward',
            indexBounds: { x: [ '(500, inf.0]' ], y: [ '(500, inf.0]' ] }
          }
        }
      }, 

The plan performs a blocking sort on y. This throws away a potentially useful property of the input stream; for each unique value of x, the input is already sorted on y. If each value of x has many values of y, a better plan may be to perform a merge sort of the existing result set of that of the current value of x we are considering in the index scan. But in the worse case where each value of x has only one document, this approach is worse than a blocking sort.


Generated at Thu Feb 08 06:59:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.