[SERVER-19983] Performance Regression in Mongo-perf Commands.DistinctWithIndexAndQuery Created: 14/Aug/15  Updated: 19/Sep/15  Resolved: 18/Aug/15

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 3.1.7

Type: Improvement
Priority: Major - P3
Reporter: David Daly
Assignee: David Storch
Resolution: Done
Votes: 0
Labels: mpreg
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-15020 Implement explain for the distinct co... Closed
Backwards Compatibility: Fully Compatible
Sprint: QuInt 8 08/28/15
Participants:

 Description   

The mongo-perf test Commands.DistinctWithIndexAndQuery shows a performance regression on both WiredTiger and MMAPv1 in Evergreen.

MMAPv1 results
WiredTiger results

Regression starts on commit: f9904bc

To recreate with mongo-perf:

python benchrun.py -t 8 -f testcases/*.js --includeFilter Commands.DistinctWithIndexAndQuery --out perf.json

To recreate with the mongo shell:

pre = function( collection ) {
    collection.drop();
    var docs = [];
    for ( var i = 0; i < 4800; i++ ) {
        docs.push( { x : 1 } );
        docs.push( { x : 2 } );
        docs.push( { x : 3 } );
    }
    collection.insert(docs);
    collection.ensureIndex( { x : 1 } );
};
use test0
pre(db.Commands_DistinctWithIndexAndQuery0)
benchRun({"ops":[{"op":"command","tags":["distinct","command","core"],"ns":"test0","command":{"distinct":"Commands_DistinctWithIndexAndQuery0","key":"x","query":{"x":1}},"safe":false,"w":0,"j":false,"writeCmd":true}],"seconds":5,"host":"127.0.0.1:27017","parallel":8})



 Comments   
Comment by Githook User [ 18/Aug/15 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-19983 avoid unnecessary IndexBounds serialization in DistinctScan query execution
Branch: master
https://github.com/mongodb/mongo/commit/0ddad55a110eb939f8c3c2fa62d3432f93c65f47

Comment by David Daly [ 17/Aug/15 ]

Yeah, 1% is definitely within error at this point.

Comment by David Storch [ 17/Aug/15 ]

Thanks David. To get back the 3.6%, we can just move the expensive call so that it will get called for explain, but will not get called for a regular (not explained) distinct command. As for the remaining ~1%, is this within the bounds of error for perf patch builds? I didn't see anything else in the patch that looked suspect.
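
A minimal sketch of that idea, not the actual patch: the scan stage keeps the raw bounds and only serializes them when stats are requested, which is the path explain would take. All names here (Bounds, ScanStats, ScanStage, serialize, getStats) are hypothetical stand-ins for illustration and are not the MongoDB server classes.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for index bounds. The call counter exists only to
// make the serialization cost observable in this sketch.
struct Bounds {
    std::vector<std::pair<int, int>> intervals;
    mutable int serializeCalls = 0;

    // Builds a string form of the bounds; assumed to be the expensive step.
    std::string serialize() const {
        ++serializeCalls;
        std::string out;
        for (const auto& iv : intervals) {
            out += "[" + std::to_string(iv.first) + ", " + std::to_string(iv.second) + "]";
        }
        return out;
    }
};

// Hypothetical stats object returned to an explain-style caller.
struct ScanStats {
    std::string indexBounds;
    int keysExamined = 0;
};

// Hypothetical scan stage: the hot path never serializes the bounds.
class ScanStage {
public:
    explicit ScanStage(Bounds bounds) : _bounds(std::move(bounds)) {}

    // Regular execution path: only cheap bookkeeping happens here.
    void work() { ++_stats.keysExamined; }

    // Stats path: serialization happens here, on demand.
    ScanStats getStats() const {
        ScanStats out = _stats;
        out.indexBounds = _bounds.serialize();
        return out;
    }

    int serializeCalls() const { return _bounds.serializeCalls; }

private:
    Bounds _bounds;
    ScanStats _stats;
};
```

With this shape, a plain run that only calls work() pays nothing for serialization, while a caller that asks for stats pays for it once per getStats() call.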

Comment by David Daly [ 17/Aug/15 ]

That patch got a 3.63% performance improvement, which is most of the 4-5% regression we saw.

Comment by David Daly [ 14/Aug/15 ]

Patch build here: https://evergreen.mongodb.com/version/55ce111b3ff1221da80002a9_0

Comment by David Daly [ 14/Aug/15 ]

Thanks, David. Sounds like a good experiment. I'll set it up and see what it shows.

Comment by David Storch [ 14/Aug/15 ]

I'm surprised that there would be any noticeable performance difference associated with this change. If I'm not mistaken, this is the only part of that change which is exercised by the test:

+    _specificStats.isMultiKey = _params.descriptor->isMultikey(getOpCtx());
+    _specificStats.isUnique = _params.descriptor->unique();
+    _specificStats.isSparse = _params.descriptor->isSparse();
+    _specificStats.isPartial = _params.descriptor->isPartial();
+    _specificStats.indexBounds = _params.bounds.toBSON();
+    _specificStats.direction = _params.direction;

Nothing here should be too expensive, though if I were going to point the finger at any of these calls, it would be _params.bounds.toBSON(). I recommend running a perf patch build with that line commented out to see whether the regression goes away.
