[SERVER-27150] Performance degradation with $sort on indexed field Created: 22/Nov/16  Updated: 06/Dec/22  Resolved: 22/Nov/16

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 3.3.12, 3.4.0-rc3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Vyacheslav Stroy Assignee: Backlog - Query Team (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows 10, Ubuntu Linux


Issue Links:
Duplicate
duplicates SERVER-7568 Aggregation framework favors non-bloc... Closed
Assigned Teams:
Query
Operating System: ALL
Steps To Reproduce:

var c = db.c;
c.drop();
 
c.ensureIndex( { a: 1 } );
 
//insert 100 000 documents
for (var thousands=0; thousands < 100; thousands++) {
	for (var i=0; i < 1000; i++) {
		var bulk = c.initializeUnorderedBulkOp();
		bulk.insert({ a: i, b: new Date()} );
		bulk.execute();
	}
}
 
//takes 2ms to execute
c.aggregate([ { $match: { a: 1 } }, { $sort: { b : 1 } } ]);
 
//create index on "b" field
c.ensureIndex( { b: 1 } );
 
//now the same query takes more than 90ms
c.aggregate([ { $match: { a: 1 } }, { $sort: { b : 1 } } ]);
 
//let's add a composite index on both fields
c.ensureIndex( { a: 1, b: 1 } );
 
//aggregation with composite index takes 2ms
c.aggregate([ { $match: { a: 1 } }, { $sort: { b : 1 } } ]);

Query plan of slow query:

{ 
    "stages" : [
        {
            "$cursor" : {
                "query" : {
                    "a" : NumberInt(1)
                }, 
                "sort" : {
                    "b" : NumberInt(-1)
                }, 
                "queryPlanner" : {
                    "plannerVersion" : NumberInt(1), 
                    "namespace" : "PanelRider.c", 
                    "indexFilterSet" : false, 
                    "parsedQuery" : {
                        "a" : {
                            "$eq" : NumberInt(1)
                        }
                    }, 
                    "winningPlan" : {
                        "stage" : "FETCH", 
                        "filter" : {
                            "a" : {
                                "$eq" : NumberInt(1)
                            }
                        }, 
                        "inputStage" : {
                            "stage" : "IXSCAN", 
                            "keyPattern" : {
                                "b" : 1.0
                            }, 
                            "indexName" : "b_1", 
                            "isMultiKey" : false, 
                            "isUnique" : false, 
                            "isSparse" : false, 
                            "isPartial" : false, 
                            "indexVersion" : NumberInt(1), 
                            "direction" : "backward", 
                            "indexBounds" : {
                                "b" : [
                                    "[MaxKey, MinKey]"
                                ]
                            }
                        }
                    }, 
                    "rejectedPlans" : [
 
                    ]
                }
            }
        }
    ], 
    "ok" : 1.0
}

Participants:

 Description   

The $sort operator on indexed field slows down aggregation pipeline execution up to 200 times when used in conjunction with $match on another indexed field.

Performance degrades linearly on collection growth. It becomes a huge problem after 1 000 000 docs. Aggregations on my production environment with 9 000 000 docs collection take more than 4s.

It seems that problem itself is related to the non-optimal index usage.

I found only two options:

  • Remove the index on the sorted field (unacceptable in my case).
  • Add the compound index on both fields (not the best decision in case of multiple queries on various fields with different sort order).


 Comments   
Comment by Charlie Swanson [ 22/Nov/16 ]

Hi kreig,

It looks like you've run into SERVER-7568. I'm closing this issue as a duplicate, and I would encourage you to vote for that issue and watch it for updates.

Thanks

Comment by Ramon Fernandez Marina [ 22/Nov/16 ]

Thanks for the detailed reproducer kreig. I'm able to observe the behavior your describe in 3.4.0-rc3, but not in 3.4.0-rc4 – would it be possible for you to upgrade and confirm whether you're still experiencing this issue?

EDIT: never mind, that was pilot error on my part. I can confirm the same behavior is present on 3.4.0-rc4.

Thanks,
Ramón.

Generated at Thu Feb 08 04:14:18 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.