[SERVER-32055] Improve multi thread performance for retryable writes
Created: 21/Nov/17  Updated: 30/Oct/23  Resolved: 28/Nov/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.6.0-rc4
Fix Version/s: 3.6.1, 3.7.1

Type: Improvement
Priority: Major - P3
Reporter: Randolph Tan
Assignee: Randolph Tan
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-31845 WT performance regression with write ... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v3.6
Sprint: Sharding 2017-12-04
Participants:

 Comments   
Comment by Githook User [ 30/Nov/17 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-32055 Do not take ShardingState mutex when updating config.transactions

(cherry picked from commit b821d0c6a2c4fcbb3b8947e4969b48921f920897)
Branch: v3.6
https://github.com/mongodb/mongo/commit/aa196ccd7598335cd75344f46ff98d47b78c70d7

Comment by Githook User [ 28/Nov/17 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-32055 Do not take ShardingState mutex when updating config.transactions
Branch: master
https://github.com/mongodb/mongo/commit/b821d0c6a2c4fcbb3b8947e4969b48921f920897

Comment by Randolph Tan [ 22/Nov/17 ]

It looks like the ShardingState mutex is a bottleneck. I tried a test patch that skips accessing the ShardingState inside the OpObserver when ns == config.transactions, and I was able to get some gains. For example, at 512 threads the mixed_insert throughput degradation was around 32.6% without the patch and dropped to 17% with it. Likewise, the mixed_findOne degradation at 512 threads was 16.7% without the patch and dropped to 4% with it.
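
The idea behind the test patch can be sketched in a few lines. This is only an illustration of the early-return pattern, not the server's code: Python stands in for the server's C++, and `ShardingStateSketch`, `on_write`, and the `lock_acquisitions` counter are hypothetical names introduced here.

```python
import threading

# Namespace named in the comment; writes to it skip the sharding check.
SESSION_TXN_NS = "config.transactions"


class ShardingStateSketch:
    """Hypothetical stand-in for a global state object guarded by one mutex."""

    def __init__(self):
        self._mutex = threading.Lock()
        self.lock_acquisitions = 0  # counts how often the mutex was taken

    def is_sharded(self, ns):
        # Every caller serializes on the same mutex, which is what turns
        # this call into a bottleneck at high thread counts.
        with self._mutex:
            self.lock_acquisitions += 1
            return False  # placeholder answer; the real lookup is out of scope


def on_write(state, ns):
    """Hypothetical observer hook. Returns True if the state was consulted."""
    if ns == SESSION_TXN_NS:
        # Session bookkeeping writes never need the sharding state,
        # so return before touching the mutex at all.
        return False
    state.is_sharded(ns)
    return True


state = ShardingStateSketch()
on_write(state, "test.coll")           # takes the mutex once
on_write(state, "config.transactions")  # skips the mutex
print(state.lock_acquisitions)
```

With retryable writes, each user write is accompanied by an update to config.transactions, so skipping the mutex for that namespace removes one contended acquisition per retryable write in this sketch.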

Here's some raw data:

Retry:

+--------------------------------+----------+--------------+
| "Test"                         | "Thread" | "Throughput" |
|--------------------------------+----------+--------------+
| "insert_vector_primary_retry"  |        1 | 95553.759319 |
| "insert_vector_secondary_load_ |        1 | 95553.759319 |
| "insert_vector_secondary_overa |        1 | 95550.574406 |
| "insert_vector_primary_retry"  |        8 | 445386.36742 |
| "insert_vector_secondary_load_ |        8 | 445386.36742 |
| "insert_vector_secondary_overa |        8 | 441294.81117 |
| "insert_vector_primary_retry"  |       16 | 460664.82609 |
| "insert_vector_secondary_load_ |       16 | 460664.82609 |
| "insert_vector_secondary_overa |       16 | 460652.03141 |
| "mixed_findOne_retry"          |        4 | 4296.7529800 |
| "mixed_insert_retry"           |        4 | 3279.2642226 |
| "mixed_update_retry"           |        4 | 2593.8110615 |
| "mixed_delete_retry"           |        4 | 2788.6286549 |
| "mixed_total_retry"            |        4 | 12958.456919 |
| "mixed_findOne_retry"          |       64 | 18500.464798 |
| "mixed_insert_retry"           |       64 | 10769.051201 |
| "mixed_update_retry"           |       64 | 7124.7827193 |
| "mixed_delete_retry"           |       64 | 7600.4217837 |
| "mixed_total_retry"            |       64 | 43994.720503 |
| "mixed_findOne_retry"          |      128 | 18045.451520 |
| "mixed_insert_retry"           |      128 | 10477.718153 |
| "mixed_update_retry"           |      128 | 6759.1522685 |
| "mixed_delete_retry"           |      128 | 6968.6623589 |
| "mixed_total_retry"            |      128 | 42250.984301 |
| "mixed_findOne_retry"          |      256 | 21005.294244 |
| "mixed_insert_retry"           |      256 | 9148.5429479 |
| "mixed_update_retry"           |      256 | 5909.3570382 |
| "mixed_delete_retry"           |      256 | 5305.6420372 |
| "mixed_total_retry"            |      256 | 41368.836268 |
| "mixed_findOne_retry"          |      512 | 25782.145266 |
| "mixed_insert_retry"           |      512 | 7234.1482837 |
| "mixed_update_retry"           |      512 | 4911.2572234 |
| "mixed_delete_retry"           |      512 | 4853.1049996 |
| "mixed_total_retry"            |      512 | 42780.655773 |
+--------------------------------+----------+--------------+

Baseline:

+--------------------------------+----------+--------------+
| "Test"                         | "Thread" | "Throughput" |
|--------------------------------+----------+--------------+
| "insert_vector_primary"        |        1 | 105452.04048 |
| "insert_vector_secondary_load_ |        1 | 105452.04048 |
| "insert_vector_secondary_overa |        1 | 105449.11144 |
| "insert_vector_primary"        |        8 | 482743.91835 |
| "insert_vector_secondary_load_ |        8 | 482743.91835 |
| "insert_vector_secondary_overa |        8 | 482730.50984 |
| "insert_vector_primary"        |       16 | 504928.75593 |
| "insert_vector_secondary_load_ |       16 | 504928.75593 |
| "insert_vector_secondary_overa |       16 | 504903.51286 |
| "mixed_findOne"                |        4 | 4207.7093058 |
| "mixed_insert"                 |        4 | 3756.0677257 |
| "mixed_update"                 |        4 | 2828.8017525 |
| "mixed_delete"                 |        4 | 3075.4222738 |
| "mixed_total"                  |        4 | 13868.001058 |
| "mixed_findOne"                |       64 | 16917.855516 |
| "mixed_insert"                 |       64 | 13708.527699 |
| "mixed_update"                 |       64 | 8001.3882321 |
| "mixed_delete"                 |       64 | 8739.0396047 |
| "mixed_total"                  |       64 | 47366.811052 |
| "mixed_findOne"                |      128 | 17290.481328 |
| "mixed_insert"                 |      128 | 13169.596474 |
| "mixed_update"                 |      128 | 7829.0958328 |
| "mixed_delete"                 |      128 | 7867.5564947 |
| "mixed_total"                  |      128 | 46156.730130 |
| "mixed_findOne"                |      256 | 19416.284754 |
| "mixed_insert"                 |      256 | 11748.420935 |
| "mixed_update"                 |      256 | 7047.1005159 |
| "mixed_delete"                 |      256 | 6834.1968022 |
| "mixed_total"                  |      256 | 45046.003007 |
| "mixed_findOne"                |      512 | 26835.195421 |
| "mixed_insert"                 |      512 | 8736.6362651 |
| "mixed_update"                 |      512 | 5027.0871146 |
| "mixed_delete"                 |      512 | 5054.3446568 |
| "mixed_total"                  |      512 | 45653.263458 |
+--------------------------------+----------+--------------+
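
The degradation figures quoted in the comment can be checked against the 512-thread rows of the two tables. A minimal sketch, assuming degradation means the relative throughput drop from the baseline run to the retry run (the comment does not spell out the formula); `degradation_pct` is a helper name introduced here.

```python
def degradation_pct(baseline, retry):
    """Relative throughput drop from baseline to retry, in percent."""
    return (baseline - retry) / baseline * 100.0


# Throughput at 512 threads, copied from the Baseline and Retry tables.
baseline_512 = {
    "mixed_insert": 8736.6362651,
    "mixed_findOne": 26835.195421,
}
retry_512 = {
    "mixed_insert": 7234.1482837,
    "mixed_findOne": 25782.145266,
}

results = {
    name: degradation_pct(baseline_512[name], retry_512[name])
    for name in baseline_512
}

for name, pct in results.items():
    print(f"{name}: {pct:.1f}%")
# prints:
# mixed_insert: 17.2%
# mixed_findOne: 3.9%
```

Under that assumption the numbers line up with the "with the patch" figures (17% and 4%), which suggests the raw data was collected with the test patch applied.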
