[SERVER-43773] ShardingTest should run the startup procedure for all of its ReplSetTest shard instances in parallel
| Created: 02/Oct/19 | Updated: 29/Oct/23 | Resolved: 27/Nov/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.3 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | William Schultz (Inactive) | Assignee: | William Schultz (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Repl 2019-11-18, Repl 2019-12-02 |
| Participants: | |
| Linked BF Score: | 14 |
| Description |
|
Currently, ShardingTest calls startSet on each of its shard ReplSetTest instances serially, so processes from different shards cannot be started at the same time. To make startup faster when there are many shards, ShardingTest can instead begin startup for all ReplSetTest instances without waiting for one to complete before moving on to the next, and then wait for all shard startup procedures to finish. This allows the startup of all shard ReplSetTest instances to proceed in parallel. |
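The difference between the serial and parallel approaches can be sketched with a small simulation. This is only an illustration under stated assumptions: `FakeShardFixture`, `beginStartup`, and `awaitStartup` are hypothetical names, not the real ReplSetTest API, and the startup costs are made-up numbers unrelated to any measurements in this ticket.

```javascript
// Hypothetical stand-in for one shard's replica set fixture.
// NOT the real ReplSetTest API: class and method names are illustrative only.
class FakeShardFixture {
    constructor(name, startupCostMs, log) {
        this.name = name;
        this.startupCostMs = startupCostMs;  // simulated time until processes are ready
        this.log = log;                      // shared event log, for inspecting ordering
        this.launchedAt = null;
    }

    // Phase 1: launch the processes and return immediately.
    beginStartup(nowMs) {
        this.launchedAt = nowMs;
        this.log.push(`launch ${this.name}`);
    }

    // Phase 2: block until the processes are ready.
    // Returns the simulated clock value at which the caller resumes.
    awaitStartup(nowMs) {
        if (this.launchedAt === null) {
            throw new Error(`${this.name} was never launched`);
        }
        const readyAt = this.launchedAt + this.startupCostMs;
        this.log.push(`ready ${this.name}`);
        return Math.max(nowMs, readyAt);
    }
}

// Serial startup: launch one shard, wait for it, then move on to the next.
function startSerially(shards) {
    let now = 0;
    for (const shard of shards) {
        shard.beginStartup(now);
        now = shard.awaitStartup(now);
    }
    return now;
}

// Parallel startup: launch every shard first, then wait for all of them.
function startInParallel(shards) {
    let now = 0;
    for (const shard of shards) {
        shard.beginStartup(now);
    }
    for (const shard of shards) {
        now = shard.awaitStartup(now);
    }
    return now;
}

// Illustrative startup costs (assumed values).
function makeShards(log) {
    return [
        new FakeShardFixture("shard0", 300, log),
        new FakeShardFixture("shard1", 500, log),
        new FakeShardFixture("shard2", 400, log),
    ];
}

const serialLog = [];
const serialMs = startSerially(makeShards(serialLog));

const parallelLog = [];
const parallelMs = startInParallel(makeShards(parallelLog));

console.log(`serial: ${serialMs} ms, parallel: ${parallelMs} ms`);
// prints "serial: 1200 ms, parallel: 500 ms"
```

In this model the serial total is the sum of the per-shard startup costs, while the parallel total is bounded by the slowest shard, which is why the benefit grows with the number of shards.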
| Comments |
| Comment by William Schultz (Inactive) [ 09/Dec/19 ] | ||||
|
We can also see how the profile of the "startupAndInitiate" metric improved. Note that the "num_nodes" metric only accounts for the total number of shard nodes and does not take into account the size of the config server replica set, which defaults to 3 nodes but may use fewer or more nodes in some cases. I think the performance improvement here is not particularly dramatic because initiation is actually the slowest part of setting up a ReplSetTest. | ||||
| Comment by William Schultz (Inactive) [ 27/Nov/19 ] | ||||
|
On an idle RHEL 6.2 spawn host, we can see the following performance metrics for the ShardingTest control tests running with a single config server node after these changes:
This scale factor of (869 ms / 816 ms) ≈ 1.06 is well within the 1.5x target. | ||||
| Comment by Githook User [ 27/Nov/19 ] | ||||
|
Author: {'name': 'William Schultz', 'username': 'will62794', 'email': 'william.schultz@mongodb.com'}Message: | ||||
| Comment by Githook User [ 26/Nov/19 ] | ||||
|
Author: {'name': 'William Schultz', 'username': 'will62794', 'email': 'william.schultz@mongodb.com'}Message: Revert " This reverts commit 72845828cdac26031d66f18ef7e7a4e108d3d178. | ||||
| Comment by Githook User [ 26/Nov/19 ] | ||||
|
Author: {'email': 'william.schultz@mongodb.com', 'name': 'William Schultz', 'username': 'will62794'}Message: | ||||
| Comment by Githook User [ 26/Nov/19 ] | ||||
|
Author: {'name': 'William Schultz', 'username': 'will62794', 'email': 'william.schultz@mongodb.com'}Message: | ||||
| Comment by William Schultz (Inactive) [ 03/Oct/19 ] | ||||
|
Yes, that is the idea. It should generally be similar to Judah's POC from | ||||
| Comment by Kaloian Manassiev [ 03/Oct/19 ] | ||||
|
Are you guys planning to make the startSet/stopSet methods asynchronous so that several of them can overlap? |