[SERVER-32711] WiredTiger scan performance for the insert benchmark Created: 16/Jan/18  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Mark Callaghan Assignee: Backlog - Performance Team
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-32707 WiredTiger performance with the inser... Backlog
Assigned Teams:
Product Performance
Participants:

 Description   

This covers full scan performance during the insert benchmark. It is related to SERVER-32707, which documents problems during the load phase of the insert benchmark. The scan phase of the insert benchmark does a full scan per collection. Here I run the insert benchmark with 16 clients and either 16 collections (one collection per client) or 1 collection (all clients share one collection). So the tests are either (see the sketch after this list):

  • 16 clients each scanning a separate collection
  • 1 client scanning one collection
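A minimal pymongo sketch of the two layouts, using hypothetical database and collection names (ibench, t0..t15); the real benchmark harness differs in detail:

    # Sketch of the two scan layouts; names are hypothetical.
    from concurrent.futures import ThreadPoolExecutor
    from pymongo import MongoClient

    db = MongoClient("mongodb://localhost:27017")["ibench"]

    def scan(coll_name):
        # Full collection scan driven through the _id index.
        return sum(1 for _ in db[coll_name].find().hint([("_id", 1)]))

    # 16 clients, 16 collections: one collection per client.
    with ThreadPoolExecutor(max_workers=16) as pool:
        counts = list(pool.map(scan, [f"t{i}" for i in range(16)]))

    # 1 collection: a single client scans the shared collection.
    count = scan("t0")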

The test is run in four configurations:

  • inMemory-1 - cached database with 16 clients and 1 collection
  • inMemory-16 - cached database with 16 clients and 16 collections (collection per client)
  • ioBound-none - database larger than memory, 16 clients, 16 collections, no compression
  • ioBound-zlib - database larger than memory, 16 clients, 16 collections, zlib compression

While I tested nine different mongo.conf variations, they all provided similar scan performance, so I only share results for the first configuration.
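For context, a hypothetical mongod.conf sketch of the kind of storage settings such variations typically touch; the nine actual variations are not listed in this ticket:

    storage:
      dbPath: /data/db
      wiredTiger:
        engineConfig:
          cacheSizeGB: 8          # large enough to cache the inMemory databases
        collectionConfig:
          blockCompressor: zlib   # "none" for ioBound-none, "zlib" for ioBound-zlib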

Each collection has 3 secondary indexes in addition to the index on _id.

This test does 5 rounds of full scans. The work done per round is (see the sketch after this list):

  1. Full scan of the index on _id. In some cases this query is slower because dirty pages are flushed while it is in progress.
  2. Full scan of the first secondary index.
  3. Full scan of the second secondary index.
  4. Full scan of the third secondary index.
  5. Full scan of the index on _id.
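A minimal pymongo sketch of one round, assuming hypothetical secondary index fields k1, k2, k3 and a filter that matches nothing; the benchmark's exact query shape may differ:

    # Sketch of one scan round; field names k1, k2, k3 are hypothetical.
    from pymongo import MongoClient

    coll = MongoClient("mongodb://localhost:27017")["ibench"]["t0"]

    def full_index_scan(index_spec):
        # The filter matches no documents, but the hint forces mongod to
        # walk the entire hinted index (fetching each document to evaluate
        # the filter) before returning 0 rows -- the pattern that
        # SERVER-31078 makes slower than necessary.
        return coll.count_documents({"no_such_field": {"$exists": True}},
                                    hint=index_spec)

    def scan_round():
        full_index_scan([("_id", 1)])  # 1. _id index (may overlap dirty-page flushes)
        full_index_scan([("k1", 1)])   # 2. first secondary index
        full_index_scan([("k2", 1)])   # 3. second secondary index
        full_index_scan([("k3", 1)])   # 4. third secondary index
        full_index_scan([("_id", 1)])  # 5. _id index again

    for _ in range(5):  # the test does 5 rounds
        scan_round()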

The queries are written to do a full scan but return 0 rows. Alas, that hits a performance bug in MongoDB (see SERVER-31078). It looks like this makes such queries take ~1.5X longer than needed, but for the sake of this ticket I will round up and claim it makes them take 2X longer than needed.
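To make that adjustment concrete: doubling query latency halves scan throughput, so under the 2X assumption a configuration measured at R million rows scanned per second would reach roughly 2R million rows per second with SERVER-31078 fixed.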



 Comments   
Comment by Mark Callaghan [ 16/Jan/18 ]

This table lists aggregate query throughput in millions of rows scanned per second for each configuration; columns 1 through 5 correspond to the five queries per round described above. The first group is for the in-memory test with 16 clients and 16 collections. The second group is for the IO-bound test with 16 clients and 16 collections. The last group is for the in-memory test with one collection.

  • InnoDB gets between 10X and 20X more throughput on the concurrent in-memory scans
  • InnoDB gets between 3X and 8X more throughput on the concurrent IO-bound scans
  • WiredTiger gets roughly the same scan throughput on the concurrent tests, both IO-bound and in-memory, because it is extremely CPU-bound

      1       2       3       4       5  configuration
  5.952   2.941   3.906   3.012   5.952  inMemory-16-WiredTiger
 55.555  55.555  55.555  62.500  62.500  inMemory-16-InnoDB
-
  5.633   2.789   3.478   2.785   5.681  ioBound-16-WiredTiger
 14.285  26.315  24.691  19.801  48.780  ioBound-16-InnoDB
-
  1.612   0.415   0.533   0.424   1.851  inMemory-1
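For reference, a small Python snippet that recomputes the per-query InnoDB/WiredTiger ratios from the table above (values copied from the table, nothing new measured); the results are broadly consistent with the 10X-20X and 3X-8X characterizations:

    # Per-query InnoDB/WiredTiger throughput ratios; values are millions
    # of rows scanned per second, copied from the table above.
    wt_mem   = [ 5.952,  2.941,  3.906,  3.012,  5.952]
    inno_mem = [55.555, 55.555, 55.555, 62.500, 62.500]
    wt_io    = [ 5.633,  2.789,  3.478,  2.785,  5.681]
    inno_io  = [14.285, 26.315, 24.691, 19.801, 48.780]

    for name, inno, wt in (("in-memory", inno_mem, wt_mem),
                           ("IO-bound ", inno_io, wt_io)):
        print(name, ["%.1fX" % (i / w) for i, w in zip(inno, wt)])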

One of the metrics I compute during this test is CPU overhead per row scanned, and that value is about 10X larger for WiredTiger than for InnoDB.
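The ticket does not say how this metric is measured; a minimal sketch under the assumption that CPU time is sampled around each query and divided by rows scanned:

    import os

    def cpu_per_row(run_query):
        # run_query() performs one full scan and returns the number of rows
        # scanned (e.g. totalKeysExamined from explain). os.times() sums
        # user+system CPU seconds for this process; a real harness would
        # measure mongod's CPU instead, e.g. from /proc/<pid>/stat.
        t0 = os.times()
        rows = run_query()
        t1 = os.times()
        cpu_secs = (t1.user - t0.user) + (t1.system - t0.system)
        return cpu_secs / rows  # CPU seconds per row scanned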
