[SERVER-32711] WiredTiger scan performance for the insert benchmark Created: 16/Jan/18 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Mark Callaghan | Assignee: | Backlog - Performance Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Product Performance
|
| Description |
|
This covers full scan performance during the insert benchmark. It is related to SERVER-32707, which documents problems during the load phase of the insert benchmark. The scan phase of the insert benchmark does a full scan per collection. Here I run the insert benchmark with 16 clients and either 16 collections (one client per collection) or 1 collection (all clients share one collection). So the tests are either:
The test is run in four configurations:
While I tested nine different mongo.conf variations, they all provide similar scan performance, so I only share results for the first configuration. Each collection has 3 secondary indexes in addition to the index on _id. This test does 5 rounds of full scans. The work done per round is:
The queries are written to do a full scan but return 0 rows. Alas, that hits a perf bug in MongoDB – see SERVER-31078. It looks like this makes such queries take ~1.5x longer than needed, but for the sake of this bug I will round up and claim it makes queries take 2X longer than needed. |
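As a rough illustration of a query that does a full scan but returns 0 rows, the sketch below simulates the idea in plain Python. It is not the benchmark's query: the field name `some_field`, the bound, and the helper names are hypothetical, and the ticket does not give the real predicate. The only point is that every document is examined while none is returned.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Hypothetical names and values; none of these come from the ticket.
FIELD = "some_field"
IMPOSSIBLE_BOUND = 0  # assumes every stored value is >= 0


def zero_row_filter() -> Dict[str, Any]:
    """Build a MongoDB-style filter that no document satisfies."""
    return {FIELD: {"$lt": IMPOSSIBLE_BOUND}}


def simulate_scan(
    docs: Iterable[Dict[str, Any]], flt: Dict[str, Any]
) -> Tuple[int, List[Dict[str, Any]]]:
    """Apply a single-field $lt filter to every document.

    Returns (documents examined, documents returned).
    """
    field, cond = next(iter(flt.items()))
    bound = cond["$lt"]
    examined = 0
    returned: List[Dict[str, Any]] = []
    for doc in docs:
        examined += 1  # every document is read, matching or not
        if field in doc and doc[field] < bound:
            returned.append(doc)
    return examined, returned


docs = [{"_id": i, FIELD: i} for i in range(1000)]
examined, returned = simulate_scan(docs, zero_row_filter())
print(examined, len(returned))  # 1000 0
```

Against a real server, a filter of this shape could be passed to a driver's find call on a field that has no index. Whether that matches the benchmark's actual queries is an assumption.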
| Comments |
| Comment by Mark Callaghan [ 16/Jan/18 ] | ||||||||
|
This table lists aggregate query throughput in millions of rows scanned per second for each configuration. The first group is for the in-memory test with 16 clients and 16 collections. The second group is for the IO-bound test with 16 clients and 16 collections. The last group is for the in-memory test with one collection.
One of the metrics that I compute during this test is CPU overhead per row scanned and that value is 10X larger for WiredTiger than for InnoDB. |
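
The two metrics mentioned in this comment, aggregate throughput in millions of rows scanned per second and CPU overhead per row scanned, can be sketched as simple ratios. This is a minimal sketch under assumptions: the function name, the choice of microseconds as the CPU unit, and the sample numbers are hypothetical and are not measurements from this ticket.

```python
from typing import Dict


def scan_metrics(rows_scanned: int, wall_seconds: float, cpu_seconds: float) -> Dict[str, float]:
    """Derive scan throughput and CPU cost per row from raw counters.

    rows_scanned: total rows scanned by all clients
    wall_seconds: elapsed time for the scan
    cpu_seconds:  total CPU time consumed during the scan
    """
    if rows_scanned <= 0 or wall_seconds <= 0:
        raise ValueError("rows_scanned and wall_seconds must be positive")
    return {
        # millions of rows scanned per second, summed over all clients
        "mrows_per_sec": rows_scanned / wall_seconds / 1e6,
        # CPU microseconds spent per row scanned
        "cpu_usecs_per_row": cpu_seconds * 1e6 / rows_scanned,
    }


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
m = scan_metrics(rows_scanned=100_000_000, wall_seconds=50.0, cpu_seconds=400.0)
print(m["mrows_per_sec"], m["cpu_usecs_per_row"])  # 2.0 4.0
```

A 10X gap in the second metric means one engine spends ten times as much CPU time per row as the other for the same scan.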