[SERVER-32707] WiredTiger performance with the insert benchmark Created: 15/Jan/18  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.6.0
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Mark Callaghan Assignee: Backlog - Performance Team
Resolution: Unresolved Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File metrics.2017-12-28T00-13-52Z-00000     File metrics.2017-12-28T00-14-02Z-00000     File metrics.2017-12-28T07-22-34Z-00000     File metrics.2017-12-28T14-27-34Z-00000     File metrics.2017-12-28T21-32-34Z-00000     File metrics.2017-12-29T04-31-37Z-00000     File metrics.2017-12-29T04-37-34Z-00000     File metrics.2017-12-29T11-42-34Z-00000     File metrics.2017-12-29T15-15-31Z-00000     File metrics.2017-12-29T19-07-34Z-00000     File metrics.interim     File metrics.interim     File metrics.interim     File metrics.interim    
Issue Links:
Related
is related to SERVER-32711 WiredTiger scan performance for the i... Backlog
Assigned Teams:
Product Performance
Participants:

 Description   

I ran the insert benchmark for WiredTiger in MongoDB 3.6.0 and summarize the problems I see here. For more details on the insert benchmark, including a link to the source, see this link. An overview of the insert benchmark:
1) Load the collection(s) using N clients (N=16 in this case). Measure the average insert rate and the insert response time distributions.
2) Do a full scan of each collection. Measure the time for the scan.
3) Use N writer clients and N reader clients (N=16 in this case). Each writer is rate limited to 1000 inserts/second; measure how fast the reader clients can do short range scans. Compute the average query rate and the query response time distributions. A sketch of a rate-limited writer follows this list.
4) Same as #3, but the rate limit is 100 inserts/second per writer.
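
To make step 3 concrete, here is a minimal sketch of a rate-limited writer client using pymongo. It is not the actual benchmark source; the database/collection names, document shape, and sleep-based rate limiting are assumptions for illustration only.

# Minimal sketch of a rate-limited writer (step 3 above). Not the actual
# benchmark code: the collection name, document shape, and sleep-based
# rate limiting are illustrative assumptions.
import random
import string
import time

from pymongo import MongoClient


def rate_limited_writer(coll, inserts_per_sec=1000, duration_sec=60):
    """Insert one document at a time, sleeping to hold the target rate."""
    interval = 1.0 / inserts_per_sec
    deadline = time.time() + duration_sec
    latencies = []
    while time.time() < deadline:
        start = time.time()
        doc = {
            "ts": start,
            "k": random.randint(0, 10**9),
            "pad": "".join(random.choices(string.ascii_letters, k=100)),
        }
        coll.insert_one(doc)
        latencies.append(time.time() - start)
        # Sleep off whatever remains of this insert's time slice.
        remaining = interval - (time.time() - start)
        if remaining > 0:
            time.sleep(remaining)
    return latencies


if __name__ == "__main__":
    client = MongoClient("mongodb://localhost:27017")
    coll = client["ibench"]["purchases_0"]  # hypothetical database/collection names
    lat = rate_limited_writer(coll, inserts_per_sec=1000, duration_sec=10)
    print("max insert latency (s):", max(lat))

A sleep-per-insert limiter like this under-runs the target rate whenever an individual insert stalls, which is exactly the behavior the response time distributions are meant to expose.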

I have seen these problems in all of the major MongoDB releases (3.0, 3.2, 3.4) and now in 3.6.0. This time I archived the diagnostic data.

I tried 9 mongo.conf variations for 4 test configurations:

  • inMemory-1 - cached database with 16 clients and 1 collection
  • inMemory-16 - cached database with 16 clients and 16 collections (collection per client)
  • ioBound-none - database larger than memory, 16 clients, no compression
  • ioBound-zlib - database larger than memory, 16 clients, zlib compression

The test server has 24 cores and 48 HW threads; hyperthreading is enabled. For the in-memory benchmarks the server has 256GB of RAM. For the IO-bound benchmarks the server has 50GB of RAM. The server also has 2 or 3 fast PCIe-based SSDs.

The mongo.conf template for the in-memory benchmarks is below; the comments at the end explain the 9 variations, and a sketch that expands them follows the template:

processManagement:
  fork: true
systemLog:
  destination: file
  path: /data/mysql/mmon360/log
  logAppend: true
storage:
  syncPeriodSecs: 600
  dbPath: /data/mysql/mmon360/data
  journal:
    enabled: true
 
operationProfiling.slowOpThresholdMs: 2000
replication.oplogSizeMB: 4000
 
storage.wiredTiger.collectionConfig.blockCompressor: none
storage.wiredTiger.engineConfig.journalCompressor: none
storage.wiredTiger.engineConfig.cacheSizeGB: 180
 
storage.wiredTiger.engineConfig.configString: "eviction_dirty_target=60, eviction_dirty_trigger=80"
 
# storage.wiredTiger.engineConfig.configString:
# eviction_target=90,eviction_trigger=95,eviction_dirty_target=85,eviction=(threads_min=4,threads_max=8)
# eviction_target=X
# eviction_trigger=X
# eviction_dirty_target=X
# eviction_dirty_trigger=X
# eviction=(threads_min=4,threads_max=4)
# checkpoint=(log_size=1GB)
 
# 1  - syncPeriodSecs=60, oplogSizeMB=4000
# 2  - syncPeriodSecs=60, oplogSizeMB=16000
# 3  - syncPeriodSecs=600, oplogSizeMB=16000
# 4  - syncPeriodSecs=60, oplogSizeMB=16000, checkpoint=1g
# 5  - syncPeriodSecs=600, oplogSizeMB=16000, checkpoint=1g
# 6  - syncPeriodSecs=600, oplogSizeMB=4000
# 7  - syncPeriodSecs=600, oplogSizeMB=4000, eviction_dirty_target=20, eviction_dirty_trigger=40
# 8  - syncPeriodSecs=600, oplogSizeMB=4000, eviction=(threads_min=4,threads_max=8)
# 9  - syncPeriodSecs=600, oplogSizeMB=4000, eviction_dirty_target=60, eviction_dirty_trigger=80
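
For reference, a minimal sketch (not part of the benchmark tooling) that expands the 9 variations listed in the comments above into the settings that change per variation; the helper and its output format are assumptions.

# Hypothetical helper that expands the 9 mongo.conf variations listed in the
# comments above. Only the settings that change per variation are emitted;
# everything else comes from the template.

VARIATIONS = {
    1: (60,  4000,  ""),
    2: (60,  16000, ""),
    3: (600, 16000, ""),
    4: (60,  16000, "checkpoint=(log_size=1GB)"),
    5: (600, 16000, "checkpoint=(log_size=1GB)"),
    6: (600, 4000,  ""),
    7: (600, 4000,  "eviction_dirty_target=20,eviction_dirty_trigger=40"),
    8: (600, 4000,  "eviction=(threads_min=4,threads_max=8)"),
    9: (600, 4000,  "eviction_dirty_target=60,eviction_dirty_trigger=80"),
}


def render(variation, cache_gb):
    """Return the per-variation overrides in the same dotted style as the template."""
    sync_secs, oplog_mb, config_string = VARIATIONS[variation]
    lines = [
        "storage.syncPeriodSecs: %d" % sync_secs,
        "replication.oplogSizeMB: %d" % oplog_mb,
        "storage.wiredTiger.engineConfig.cacheSizeGB: %d" % cache_gb,
    ]
    if config_string:
        lines.append('storage.wiredTiger.engineConfig.configString: "%s"' % config_string)
    return "\n".join(lines)


if __name__ == "__main__":
    for n in sorted(VARIATIONS):
        print("# variation %d" % n)
        print(render(n, cache_gb=180))  # 180 for in-memory, 10 for the IO-bound tests
        print()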

The mongo.conf template for the IO-bound tests is below. The big difference from the configuration above is that cacheSizeGB is reduced from 180 to 10. I won't paste mongo.conf for the test that used compression, but the change from the template below is obvious (blockCompressor is set to zlib).

processManagement:
  fork: true
systemLog:
  destination: file
  path: /data/mysql/mmon360/log
  logAppend: true
storage:
  syncPeriodSecs: 600
  dbPath: /data/mysql/mmon360/data
  journal:
    enabled: true
 
operationProfiling.slowOpThresholdMs: 2000
replication.oplogSizeMB: 4000
 
storage.wiredTiger.collectionConfig.blockCompressor: none
storage.wiredTiger.engineConfig.journalCompressor: none
storage.wiredTiger.engineConfig.cacheSizeGB: 10
 
storage.wiredTiger.engineConfig.configString: "eviction_dirty_target=60, eviction_dirty_trigger=80"
 
# storage.wiredTiger.engineConfig.configString:
# eviction_target=90,eviction_trigger=95,eviction_dirty_target=85,eviction=(threads_min=4,threads_max=8)
# eviction_target=X
# eviction_trigger=X
# eviction_dirty_target=X
# eviction_dirty_trigger=X
# eviction=(threads_min=4,threads_max=4)
# checkpoint=(log_size=1GB)
 
# 1  - syncPeriodSecs=60, oplogSizeMB=4000
# 2  - syncPeriodSecs=60, oplogSizeMB=16000
# 3  - syncPeriodSecs=600, oplogSizeMB=16000
# 4  - syncPeriodSecs=60, oplogSizeMB=16000, checkpoint=1g
# 5  - syncPeriodSecs=600, oplogSizeMB=16000, checkpoint=1g
# 6  - syncPeriodSecs=600, oplogSizeMB=4000
# 7  - syncPeriodSecs=600, oplogSizeMB=4000, eviction_dirty_target=20, eviction_dirty_trigger=40
# 8  - syncPeriodSecs=600, oplogSizeMB=4000, eviction=(threads_min=4,threads_max=8)
# 9  - syncPeriodSecs=600, oplogSizeMB=4000, eviction_dirty_target=60, eviction_dirty_trigger=80



 Comments   
Comment by Mark Callaghan [ 16/Jan/18 ]

The data below has metrics from vmstat and iostat. Some are normalized by the insert rate; a sketch of the derivations follows this list. This makes it easy to see why the average insert rates are better for InnoDB than for WiredTiger:

  • ips.av - average insert rate
  • r/i - iostat reads per insert --> iostat r/s divided by ips.av
  • rkb/i, wkb/i - KB read from and written to storage per insert
  • Mcpu/i - relative CPU overhead per insert, from the vmstat us and sy columns
  • size - database size in GB at test end
  • rss - mongod/mysqld process size (RSS) in GB at test end
  • r/s - iostat r/s
  • rmb/s, wmb/s - iostat read & write MB/s
  • cpu - vmstat us + sy columns
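
As a sanity check on the normalization, a minimal sketch that derives r/i, rkb/i, wkb/i and Mcpu/i from the raw averages; the MB-to-KB factor of 1000 and the 1e6 scale for Mcpu/i are my inference, but they reproduce the table values below within rounding.

# Hypothetical derivation of the normalized metrics in the tables below.
# Inputs are averages over the load: iostat r/s, rMB/s, wMB/s, the vmstat
# us+sy CPU percentage, and the average insert rate (ips.av).

def normalize(ips_av, r_s, rmb_s, wmb_s, cpu_pct):
    return {
        "r/i": r_s / ips_av,               # storage reads per insert
        "rkb/i": rmb_s * 1000.0 / ips_av,  # KB read from storage per insert
        "wkb/i": wmb_s * 1000.0 / ips_av,  # KB written to storage per insert
        "Mcpu/i": cpu_pct * 1e6 / ips_av,  # relative CPU per insert
    }

# Example: the WiredTiger ioBound-none row below.
print(normalize(ips_av=14006, r_s=28896, rmb_s=431.0, wmb_s=550.5, cpu_pct=28.8))
# -> approximately {'r/i': 2.06, 'rkb/i': 30.77, 'wkb/i': 39.30, 'Mcpu/i': 2056}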

These are the metrics for inMemory-16 with configuration 1 for WiredTiger.

  • InnoDB has a better average insert rate (ips.av)
  • WiredTiger uses ~1.5X more CPU/insert
  • InnoDB writes ~2X more to storage per insert (wkb/i), so WiredTiger is more efficient
  • but InnoDB sustains more than 3X the write rate to storage (wmb/s) which supports a higher average insert rate

ips.av  wkb/i   Mcpu/i  size    rss     r/s     rmb/s   wmb/s   cpu     engine
157729   2.43    281    108      71.0    4      0       383.5   44.4    WiredTiger-3.6.0
230309   5.31    178     98     106.8    0      0      1223.1   41.1    InnoDB-5.7.17

These are the metrics for ioBound-none with configuration 1 for WiredTiger.

  • the insert rate (ips.av) is ~4.5X higher with InnoDB
  • storage reads per insert (r/i) and KB read per insert (rkb/i) are ~20X larger for WiredTiger. This likely explains the difference in performance
  • storage KB written per insert (wkb/i) is ~3X larger for WiredTiger
  • CPU per insert (Mcpu/i) is ~4X larger for WiredTiger
  • InnoDB sustains ~2X the write rate to storage (wmb/s). Perhaps it is more effective at scheduling writes, but it also benefits from doing less read IO, which saves capacity for more writes
  • In this test there are 500M rows in InnoDB and 250M docs in WiredTiger, so WiredTiger uses ~2.4X more space than InnoDB per row/doc

ips.av   r/i    rkb/i   wkb/i   Mcpu/i  size    rss     r/s     rmb/s   wmb/s   cpu     engine
 14006   2.06   30.77   39.31   2055    491      9.1    28896   431.0   550.5   28.8    WiredTiger-3.6.0
 62332   0.09    1.45   14.55    553    402     40.0     5659    90.5   906.7   34.5    InnoDB-5.7.17

Comment by Mark Callaghan [ 16/Jan/18 ]

The uploaded diagnostic.data files were collected across all of the tests: load, scan, read-write.

Comment by Mark Callaghan [ 16/Jan/18 ]

For the ioBound-none test and configuration 1
metrics.2017-12-28T07-22-34Z-00000 metrics.2017-12-28T21-32-34Z-00000 metrics.2017-12-29T04-37-34Z-00000 metrics.2017-12-28T14-27-34Z-00000 metrics.2017-12-28T00-14-02Z-00000 metrics.2017-12-29T11-42-34Z-00000 metrics.interim metrics.2017-12-29T19-07-34Z-00000

Comment by Mark Callaghan [ 16/Jan/18 ]

For the inMemory-16 test and configuration 9
metrics.2017-12-29T15-15-31Z-00000 metrics.interim

Comment by Mark Callaghan [ 16/Jan/18 ]

For the inMemory-16 test and configuration 7
metrics.2017-12-29T04-31-37Z-00000 metrics.interim

Comment by Mark Callaghan [ 16/Jan/18 ]

For the inMemory-16 test and configuration 1
metrics.2017-12-28T00-13-52Z-00000 metrics.interim

Comment by Ramon Fernandez Marina [ 16/Jan/18 ]

Hi mdcallag, if you still have the contents of the diagnostic.data directory for your tests, will you please upload them to the ticket?

Thanks,
Ramón.

Comment by Mark Callaghan [ 15/Jan/18 ]

Using my helper script, a typical command line to run all of the tests (load, scan, read-write) is:

# set this to "no" for collection per client and "yes" for 1 collection
only1=...
 
# number of rows
nr=...
 
# string to match storage device in iostat output
dev=...
 
bash iq.sh mongo "" /path/to/mongo /path/to/data/directory $dev 1 16 yes no $only1 0 no $nr no

Comment by Mark Callaghan [ 15/Jan/18 ]

For the ioBound-zlib test (IO-bound, 16 clients, collection per client, zlib compression), the average insert rates per configuration are:
1 - 14036.8
2 - 13897.3
3 - 15303.1
4 - 13727.1
5 - 14249.7
6 - 15298.8
7 - 16122.1
8 - 17082.6
9 - still running

And the worst-case response time in seconds per configuration:
1 - 2.854469
2 - 2.915120
3 - 3.175969
4 - 2.797139
5 - 3.282094
6 - 3.026273
7 - 6.729078
8 - 3.309463
9 - still running

I won't compare this with InnoDB because InnoDB compression isn't ready for prime time.

Comment by Mark Callaghan [ 15/Jan/18 ]

I didn't test the IO-bound configuration with 1 collection and 16 concurrent clients because it takes too long to get results for the nine mongo.conf variations. From previous tests, the average insert rate for WiredTiger drops from ~15,000/second to ~3,500/second in the test that uses 1 collection.

For the IO-bound setup with 1 collection and WiredTiger in MongoDB 3.4.6 I get

  • 3542 inserts/second during the load
  • worst-case stall time of ~11 seconds

Comment by Mark Callaghan [ 15/Jan/18 ]

For ioBound-none (io-bound, no compression)

Average insert rate per configuration
1 - 14006.0
2 - 13888.2
3 - 15302.1
4 - 13634.2
5 - 14255.3
6 - 15252.1
7 - 16124.2
8 - 17138.6
9 - 16870.9

Worst-case response time for an insert during the load per configuration:
1 - 2.609088
2 - 2.924722
3 - 3.389994
4 - 2.879052
5 - 2.910173
6 - 3.157802
7 - 1.989234
8 - 2.964498
9 - 2.822281

For comparison, InnoDB in MySQL 5.7 gets

  • ~60k inserts/second
  • worst case response time for an insert is ~0.8 seconds

I assume the worst-case stalls here are better than for the in-memory tests because the in-memory tests sustain much higher average insert rates, so there is more stress. My comparison with InnoDB isn't done to cast FUD at WiredTiger. We spent a long time fixing InnoDB stalls from workloads like this. This is a hard problem.

Comment by Mark Callaghan [ 15/Jan/18 ]

Next up is inMemory-16 (in-memory, 16 clients, collection per client). The results here are similar to the results for inMemory-1.

Average insert rates during the load per configuration. Configuration 9 has the best rate:
1 - 113173.3
2 - 114678.8
3 - 169836.9
4 - 73855.2
5 - 86535.1
6 - 170765.0
7 - 194855.8
8 - 165562.9
9 - 240847.7

This is the worst-case response time in seconds for an insert during the load. Configuration 9 has the best of the worst cases, about 7 seconds:
1 - 42.269483
2 - 31.099483
3 - 17.184345
4 - 9.068860
5 - 33.222387
6 - 12.693424
7 - 173.216546
8 - 8.277559
9 - 7.319484

Comment by Mark Callaghan [ 15/Jan/18 ]

First up is inMemory-1 (in-memory, 1 collection, 16 clients). This is the average insert rate per configuration:
1 - 42265
2 - 51335
3 - 45872
4 - 53844
5 - 54289
6 - 46117
7 - 103220
8 - 49761
9 - 205086

And the worst-case response time for an insert, in seconds, during the load. Even the best configuration suffers a stall of at least 7 seconds:
1 - 42.269483
2 - 31.099483
3 - 17.184345
4 - 9.068860
5 - 33.222387
6 - 12.693424
7 - 173.216546
8 - 8.277559
9 - 7.319484

For comparison with modern InnoDB (5.7, 8.0), noting that the MySQL tests insert 500M rows while I limit MongoDB to 250M because it uses much more space:

  • the worst-case load stalls are about 0.1 seconds
  • the average insert rates are 220k to 230k per second

So with a good configuration WiredTiger can reach insert rates similar to InnoDB's, but WiredTiger's worst-case stalls are much worse.
