[SERVER-50443] When j:false, performance degradation from 4.0 to 4.2 for 1-node replica sets Created: 21/Aug/20 Updated: 07/Mar/22 Resolved: 07/Mar/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance |
| Affects Version/s: | 4.2.8 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Martin Mlynář | Assignee: | Kelsey Schubert |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Operating System: | ALL | ||||||||||||
| Steps To Reproduce: | Tests are run using https://github.com/idealo/mongodb-performance-test.git: java -jar ./latest-version/mongodb-performance-test.jar -m insert -o 10000000 -t 100 -db test -c perf -port 28888 -h mongoserver When running 4.0.6 standalone server with: I got: When I start as replica set (no other replicas added, just start with replSet and call rs.initiate(): I got: Huge drop, but fine, RS is enabled. Now let’s try it with 4.2.8: {{docker run -ti -v /srv/mongo/test-mongo:/srv/mongodb/ --net mean rate = 167455.01 calls/second and same with replset enabled: mean rate = 10868.81 calls/second Huge drop to 10kcalls/second.
I've made git bisect and identified these commits which drags performance down the most: b2384a85f62431d88ac01067ae4d1d3dbb4fede5 mean rate = 103095,21 calls/second 4cee07d8a97bb0663e7bfbc3f2e1fbf539140adf is the first bad commit
:040000 040000 948822d607a4ae02c4b18c9eac471e123fae892d 66d95e2b3a7fbdfda5aff07b58799dc8af359f3e M jstests drags down to mean rate = 22841,24 calls/second
Second: >>>>>> bcaa4eb01c mean rate = 37827,93 calls/second e4b08de746ef4e472bfdd4790fdb2b89d5a62f1a is the first bad commit
:040000 040000 e98acae85e9e9e09dc773dd464be513376791c96 bda8f41ecf04f82c70c8c2bc42dcdc3375b9e656 M jstests >>>>>>>> bcde4c4fd3 mean rate = 26577,37 calls/second
3rd: 86efdd6b047927d3a37577a6033bacc709c7ae89 is the first bad commit
This review does several related things:
:040000 040000 cc0ee8580816f86084bcdab28f8ae92ec9da2bf1 a77afd98d1fa56c79b71de4ec7835e4905a9c664 M jstests 574b62017cad311690eba012222459ff5e41e8c5 mean rate = 11348,16 calls/second |
||||||||||||
| Participants: | |||||||||||||
| Linked BF Score: | 0 | ||||||||||||
| Description |
|
Hi all, I’m currently running mongodb 4.0.6 and playing with upgrade to 4.2.8. After some benchmarks I’ve noticed really strange and huge performance degradation. Can anyone give me some insight what’s going on? I've made |
| Comments |
| Comment by Kelsey Schubert [ 07/Mar/22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Sorry for the delay getting back to you. We have similar performance tests in our CI infrastructure, which do not show this regression. We also haven't had any other user reports indicating that this is a significant area of concern. Given the age of these releases, if we were able to identify the root cause why you are seeing a performance regression and fix would not be likely as it may risk destabilizing the branch. Thanks, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Eric Sedor [ 21/Aug/20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
nextsux@gmail.com, as a quick note, j:true is going to end up implying at least w:1, from a client perspective. See here | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Martin Mlynář [ 21/Aug/20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I've done some more tests: I'm using these docker-compose files for 4.0 and 4.2:
Both clusters are initialized with: rs.initiate() For 4.0 I'm getting: For 4.2: JOURNALED:
4.0: mean rate = 24108,72 calls/second UNACKNOWLEDGED:
So I do not think it's just problem with a special case of 1 node RS. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Martin Mlynář [ 21/Aug/20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hi Eric, first of all thank you for looking at this issue so fast! I'll re-run tests with master-slave-arbiter RS later this evening and let you know of my results. Unfortunately I'm a little bit skeptical as I've hit this issue with my full setup (2 shards, each consisting of master slave and arbiter) at first. Good catch about the w: ACKNOWLEDGED, I've missed that. I'll also try same set of tests with patched tool with writeConcern set to JOURNALED and UNACKNOWLEDGED (which is closest to my real workload) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Eric Sedor [ 21/Aug/20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
This is a regression specifically for the combination of using a 1-node replica set and performing writes with j:false. j:false is the default journaling behavior implied by writeConcern.ACKNOWLEDGED, which is used by this test. You should see improvement by using j:true or testing on a larger replica set. Unfortunately in this case, testing on a 1-node replica set is not representative of what you would see in commonly deployed 3-node replica sets on 4.2. I'm going to pass this ticket on to an appropriate team for consideration. Eric |