[SERVER-20609] Performance Regression from 3.0.6 for Mongo-Perf Commands.CountsFullCollection Created: 17/Sep/15 Updated: 24/Nov/15 Resolved: 17/Nov/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | David Daly | Assignee: | Adam Midvidy |
| Resolution: | Done | Votes: | 0 |
| Labels: | mpreg | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Operating System: | ALL | ||||||||||||
| Sprint: | Platform B (10/30/15), Platform C (11/20/15) | ||||||||||||
| Participants: | |||||||||||||
| Description |
|
15% (MMAP) to 20% (wiredTiger) regression on Commands.CountsFullCollection from 3.0.6. The regression occurs somewhere between 3.1.3 and 3.1.4 in both cases. Simple repro in the shell:
|
| Comments |
| Comment by Adam Midvidy [ 17/Nov/15 ] | |
|
I managed to reduce some of the regression, but as we are now doing more work on the command path (read after optime, readConcern checking, etc.), there is going to be some penalty. | |
| Comment by Githook User [ 10/Nov/15 ] | |
|
Author: Adam Midvidy (amidvidy) <amidvidy@gmail.com> Message: | |
| Comment by Githook User [ 10/Nov/15 ] | |
|
Author: Adam Midvidy (amidvidy) <amidvidy@gmail.com> Message: | |
| Comment by Githook User [ 30/Oct/15 ] | |
|
Author: Adam Midvidy (amidvidy) <amidvidy@gmail.com> Message:
| |
| Comment by Githook User [ 30/Oct/15 ] | |
|
Author: Adam Midvidy (amidvidy) <amidvidy@gmail.com> Message: | |
| Comment by Githook User [ 30/Oct/15 ] | |
|
Author: Adam Midvidy (amidvidy) <amidvidy@gmail.com> Message: | |
| Comment by Githook User [ 23/Oct/15 ] | |
|
Author: Adam Midvidy (amidvidy) <amidvidy@gmail.com> Message: | |
| Comment by Adam Midvidy [ 23/Oct/15 ] | |
|
Through the various optimizations on this ticket I've realized a 7% gain in CountsFullCollection performance. I don't think we will be able to make the full regression go away, as we are now doing more work along the command path than we previously were. | |
| Comment by David Daly [ 20/Oct/15 ] | |
|
adam.midvidy, the perf loop uses the shell built from the same githash as the mongod. We can set up an experiment to test that. Also note that for the baseline comparison patch builds, we have cherry-picked bench.cpp changes onto our patch build for consistency. Other driver/client changes are not picked up. | |
| Comment by Adam Midvidy [ 20/Oct/15 ] | |
|
I have a theory that some of the degradation may have occurred as a result of changes I made in the client driver. That is, much of the slowness may come from benchrun itself getting slower, not the server. david.daly, does mongo-perf always use the shell binary built for a given commit to run its benchmarks? | |
| Comment by David Daly [ 02/Oct/15 ] | |
|
The above checkin is to mask the performance regression until this ticket is fixed. The commit should be reverted when this ticket is fixed. | |
| Comment by David Daly [ 02/Oct/15 ] | |
|
I tried looking at callgraph traces on context switches. The traces are essentially the same, and the counts are approximately the same as well. | |
| Comment by Githook User [ 24/Sep/15 ] | |
|
Author: dalyd <david.daly@mongodb.com> Message: | |
| Comment by Martin Bligh [ 23/Sep/15 ] | |
|
odd ... maybe callgraph trace on context switches? | |
| Comment by David Daly [ 22/Sep/15 ] | |
|
This regression happens over multiple commits between 3.1.3 and 3.1.4. I tried doing a bisect, but found multiple small drop-offs rather than one big one. This can be reduced to basic request handling. The regression is insensitive to the size of the collection, and can be reproduced using the ping command. The regression is the same single-threaded or with 8 threads.
I did some profiling of the two builds. Differences appear in library/kernel routines like do_select, dequeue_entity, __schedule, fget_light, cpuacct_charge. |
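
The bisect result described above (no single large drop, only several small ones) can be illustrated with a short sketch. The throughput figures below are hypothetical values chosen for illustration, not measurements from this ticket, and the helper names are made up for this example. The sketch only shows how several per-commit drops of 2% to 3% compound into an overall regression of about 15% without any one commit standing out.

```python
# Hypothetical throughput (ops/sec) at successive commits between two releases.
# These numbers are invented for illustration; they are NOT measurements
# from this ticket.
throughput = [10000, 9700, 9450, 9200, 8950, 8700, 8500]


def step_drops(samples):
    """Fractional drop between each pair of adjacent commits."""
    return [(prev - cur) / prev for prev, cur in zip(samples, samples[1:])]


def total_drop(samples):
    """Fractional drop from the first commit to the last."""
    return (samples[0] - samples[-1]) / samples[0]


def largest_step(samples):
    """Index and size of the largest single-commit drop."""
    drops = step_drops(samples)
    idx = max(range(len(drops)), key=drops.__getitem__)
    return idx, drops[idx]


if __name__ == "__main__":
    idx, worst = largest_step(throughput)
    print(f"total drop: {total_drop(throughput):.1%}")
    print(f"largest single step: {worst:.1%} (between commits {idx} and {idx + 1})")
```

With data shaped like this, a bisect that looks for one commit responsible for most of the loss has nothing to converge on, which is consistent with the observation that the regression is spread across basic request handling.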