[SERVER-27045] Over 40% regression on mongodb using YCSB workload - follow-on to SERVER-26674 Created: 15/Nov/16  Updated: 23/Jan/17  Resolved: 05/Dec/16

Status: Closed
Project: Core Server
Component/s: Performance, WiredTiger
Affects Version/s: 3.4.0-rc3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Lilian Romero Assignee: Michael Cahill (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File 2016-11-21T17-40-57Z.tar.gz     File 2016-11-22T16-37-51Z.tar.gz     PNG File cache-3-3-8.png     PNG File cache-3-4-0-rc3.png     File diagnostic.data.338.tar.gz     File diagnostic.data_3.4cr3.tar.gz     File stack.high.mongo     File stack.low.mongo    
Issue Links:
Related
related to WT-3040 YCSB can generate large maximum in-me... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

How to reproduce:
Use YCSB 0.6.0.
Configuration (a sketch of the invocation follows the property list below):
fieldcount=10
fieldlength=10
recordcount=120000000
operationcount=100000000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=true
readproportion=0.5
updateproportion=0.5
scanproportion=0
insertproportion=0
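
For reference, a minimal sketch of driving a property file like the one above through YCSB's MongoDB binding; the property file name, connection URL, and thread count here are illustrative, not taken from this ticket:

# load phase: insert the 120M records
./bin/ycsb load mongodb -s -P server27045.properties -p mongodb.url=mongodb://localhost:27017/ycsb -threads 64
# run phase: 100M operations at the 50/50 read/update mix
./bin/ycsb run mongodb -s -P server27045.properties -p mongodb.url=mongodb://localhost:27017/ycsb -threads 64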

System used: Power with 256GB of memory.

Sprint: Storage 2016-12-12

 Description   

We opened ticket SERVER-26674 for an earlier performance issue. While 3.4.0-rc3 fixed the problem reported there, when we tried the same YCSB test with a different set of parameters we ran into a new performance problem.

version         Results
----------------------------------------
3.3.8           302280.29 ops/sec
3.4.0-rc3       130755.69 ops/sec

The vmstat output with 3.4.0-rc3 shows highs and lows, whereas with 3.3.8 it is very even.



 Comments   
Comment by Jim Van Fleet [ 29/Nov/16 ]

Well, 5% is almost within measurement variance. I ran again today and collected profile data, etc. – I did not see the symptoms from earlier (widely varying cpu usage and calls to eviction). Throughput was down a bit because of the data collection, but I think we are ok for now. One thought, though, is to perhaps make the parameters we used the defaults.

OK to close

Comment by Alexander Gorrod [ 28/Nov/16 ]

Thanks for the new data, jvf; from the numbers you posted, the most recent result is 5% slower than the numbers achieved on the 3.3.8 development release.

I have analyzed the diagnostic data you uploaded, and throughput is very stable on this most recent run. I could not diagnose any further bottlenecks related to WiredTiger cache management. There have been a lot of changes that improve correctness and functionality between 3.3.8 and 3.4.0; as such, it would be difficult to isolate the causes of this performance difference. Are you happy for this ticket to be closed now, with the knowledge that MongoDB will continue to improve performance in the future, and that YCSB is one of the benchmarks we use to track that improvement?

Comment by Jim Van Fleet [ 22/Nov/16 ]

diagnostic data

Comment by Jim Van Fleet [ 22/Nov/16 ]

Ran with those changes; throughput improved a little, to 286722.60 ops/sec. Diagnostic data attached.

Comment by Michael Cahill (Inactive) [ 22/Nov/16 ]

jvf thanks for the additional information.

A couple of things I noticed about the workload:

  • there appears to be I/O saturation every 30 seconds (on devices sdl through sdr), leading to slow write operations to the filesystem. These aren't obviously correlated with operations in MongoDB, so they are probably caused by Linux flushing dirty pages. You might want to try setting vm.dirty_background_ratio = 1 (or some other small number) to see if that helps smooth out this I/O; see the sysctl sketch after this list.
  • this YCSB workload is doing random updates with a uniform request distribution. That means it will attempt to dirty all the data in the collection all the time. There appears to be about 36GB of uncompressed data in cache, and the default settings try to reduce the amount of dirty data a checkpoint has to process to 5% of the cache size; with the default cache of half this machine's 256GB of memory, that is 5% of 128GB, or 6.4GB. That scrubbing step is the cause of the remaining time difference compared with 3.3.8.
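
A minimal sketch of the sysctl change suggested in the first bullet, assuming a typical Linux host (where the default for this knob is usually 10); the file name used for persistence is just an example:

# start background writeback at 1% of memory to smooth out the periodic I/O bursts
sudo sysctl -w vm.dirty_background_ratio=1
# persist across reboots, if desired (example file name)
echo 'vm.dirty_background_ratio = 1' | sudo tee /etc/sysctl.d/99-mongodb.conf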

You should be able to restore the 3.3.8 behavior with the following settings:

mongod --wiredTigerEngineConfigString="eviction_dirty_trigger=95,eviction_dirty_target=80"

Note that these settings will result in checkpoints taking a long time to complete, leading to additional disk requirements for journal data and slower recovery after an unclean shutdown. For this workload, they should avoid eviction entirely because the dirty working set will stay in cache.
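
For completeness, the same engine string can also be set in the mongod config file rather than on the command line, via the storage.wiredTiger.engineConfig.configString option; a sketch of the equivalent form, not taken from this ticket:

storage:
  wiredTiger:
    engineConfig:
      configString: "eviction_dirty_trigger=95,eviction_dirty_target=80"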

Comment by Jim Van Fleet [ 21/Nov/16 ]

Ran today – same params, same-ish result: 262966.39 ops/sec. Diagnostic data attached.

Comment by Michael Cahill (Inactive) [ 20/Nov/16 ]

jvf, thanks, it's good to hear that this setting helped. Can you upload the diagnostic.data from that run so I can see where the remaining 13% is going?

Comment by Jim Van Fleet [ 18/Nov/16 ]

I ran with that change and the result improved to 262908.22 ops/sec, which is still about 13% down from 3.3.8. You are certainly on the right track – just a little bit more. Can you suggest different trigger percentages to experiment with? Other parameters?

Comment by Michael Cahill (Inactive) [ 18/Nov/16 ]

lilianr@us.ibm.com, thank you for the data, I have been looking into this today.

This reduction in throughput is caused by some changes to limit how much of the WiredTiger cache is permitted to be dirty by default. In the 3.3.8 runs, much of the cache is dirty, leading to variable memory use and checkpoints taking a very long time (over 10 minutes each). The 3.4.0-rc4 runs here are behaving better in terms of memory use and checkpoints completing, but the stalls are unexpected and unacceptable.

If you are able to run further tests with 3.4.0 release candidates, can you please try starting MongoDB with the following setting:

mongod --wiredTigerEngineConfigString="eviction_dirty_trigger=95" ...

That should largely restore the 3.3.8 behavior: if you can run that test, please let me know how the results look.
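
If it is useful to confirm when the dirty limit is the bottleneck, one way to watch the dirty cache level during a run is via serverStatus from the shell; a sketch, with the stat names taken from the WiredTiger cache section of serverStatus:

mongo --quiet --eval 'var c = db.serverStatus().wiredTiger.cache; print(c["tracked dirty bytes in the cache"] + " dirty of " + c["maximum bytes configured"] + " configured");'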

I have reproduced this effect and will work to improve the behavior with the default settings. I will let you know when I have more information.

Comment by Lilian Romero [ 16/Nov/16 ]

The config file used is the default.
Here is portion of the vmstat for 3.4 rc3. As mentioned previously in version 3.3.8, cpu utilization is very stable where as in 3.4 rc3 there are high and lows. The attachment
shows the pstack for the low and the high. Note that in the low case over 90 threads are in wait from __wt_cache_eviction_worker where as in higher utilization there is only 3.

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
106  0      0 152681600 138432 51564928    0    0     8 49440 418499 1019938 51 12 37  0  0
108  0      0 152542144 138432 51686464    0    0     4 47808 420635 1026560 52 12 36  0  0
101  0      0 152421120 138432 51808192    0    0     0 54824 420483 1055053 51 12 36  0  0
112  0      0 152278016 138432 51908544    0    0     0 1281200 414543 1067763 52 12 36  0  0
77  6      0 152150336 138432 52009920    0    0     0 3136104 432092 1070823 51 13 35  1  0
100  0      0 152058240 138432 52107008    0    0     0 2584652 293156 510549 21  6 72  0  0
64  0      0 151957888 138432 52203840    0    0     0 673368 340965 716791 33 11 57  0  0
40  0      0 151799616 138432 52360256    0    0     0 36628 370159 820120 39 12 49  0  0
120  0      0 151639616 138432 52519616    0    0     0 32972 366290 830841 39 11 50  0  0
103  0      0 151487872 138432 52669248    0    0     0 35640 364529 822919 37 11 52  0  0
54  0      0 151355328 138432 52802240    0    0     0 35040 362368 813807 37 10 53  0  0
107  1      0 151219968 138432 52937792    0    0     0 31780 360528 813909 36 11 52  0  0
12  0      0 151118272 138432 53038784    0    0     0 23144 273532 639754 27  8 66  0  0
 4  0      0 151095936 138432 53062336    0    0     0  8632 181240 400825 12  4 84  0  0
 3  0      0 151087680 138432 53070336    0    0     0  8832 174087 334576 11  4 85  0  0
 4  0      0 151078336 138432 53079424    0    0     0  7836 181645 358191 10  4 86  0  0
 3  0      0 151069568 138432 53087808    0    0     0  8972 169224 321014 10  4 86  0  0
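
For reference, per-thread stack snapshots like stack.high.mongo and stack.low.mongo can be captured during a high or low interval with something along these lines; the invocation is illustrative, not taken from this ticket:

# snapshot every mongod thread's stack during a low-throughput interval
pstack $(pidof mongod) > stack.low.mongo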

Comment by Lilian Romero [ 16/Nov/16 ]

The attachment contains the diagnostic data for version 3.3.8

Comment by Lilian Romero [ 15/Nov/16 ]

The attachment contains the diagnostics data for 3.4.cr3

Comment by Daniel Pasette (Inactive) [ 15/Nov/16 ]

Hi Lilian, thanks for the report. A couple of questions:

  • What configuration are you running your mongodb with (either the startup params or the config file used)?
  • Can you attach the files from the diagnostic.data directory in the dbpath for runs with 3.3.8 and 3.4.0-rc3?