[SERVER-31020] Sharding database creation is slow because of shard disk-usage statistics gathering Created: 11/Sep/17 Updated: 12/Dec/23 |
|
| Status: | Open |
| Project: | Core Server |
| Component/s: | Internal Code, Sharding |
| Affects Version/s: | 3.4.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Anthony Brodard | Assignee: | Backlog - Catalog and Routing |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | 12/12, ShardingRoughEdges, high-value |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Catalog and Routing |
| Operating System: | ALL |
| Steps To Reproduce: | |
| Participants: | |
| Case: | (copied to CRM) |
| Story Points: | 3 |
| Description |
|
Hi team, we are running a 10-shard cluster (Primary / Secondary / Arbiter) that hosts around 20k databases.
It looks like a
Adding the "nameOnly:true" option makes the query almost instantaneous. Maybe it should be added when a mongos checks whether a database exists? Regards, |
| Comments |
| Comment by John Moser [ 04/Nov/22 ] |
|
Dear all, do you have any update? We just experienced a massive slowdown when creating a new db (collection) with >10k databases/collections in place. Thanks, John |
| Comment by Kaloian Manassiev [ 22/Sep/17 ] |
|
Optimizing the shard utilization statistics gathering is a sizable task, which first needs to be designed and then scheduled for implementation. Right now it is in the 'Backlog', which means it will at some point be weighed and prioritized against other server tasks. Because of this, we cannot promise a particular release in which it will be available. Sorry we couldn't give you a more definite timeline; please keep monitoring this ticket for further updates. Best regards, |
| Comment by Anthony Brodard [ 21/Sep/17 ] |
|
Thanks for this explanation, Dmitry! Do you have an ETA for the optimization itself? Will it be included in a future 3.4 release, or only in 3.6? Regards, |
| Comment by Dmitry Agranat [ 13/Sep/17 ] |
|
Hi Anthony, Thank you for reporting this behavior. Under the current design, we send listDatabases when creating a new database in order to check which shard currently has the least data on it. This is why we cannot use nameOnly:true instead: we need to get the size of each shard. Here’s the code for the mongos create command, which is used to create a new collection or view (we don’t have an explicit createDatabase command). This calls createShardDatabase. If the database doesn’t exist in the config metadata, then we call _selectShardForNewDatabase, which in turn calls shardutil::retrieveTotalShardSize in order to determine the total data on each shard, so that it can create the new database on the smallest shard. We have some ideas about optimizing this. We will post them in this ticket once these ideas are finalized. Thanks, |
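
For readers following the explanation above, here is a minimal Python sketch of the selection step. It is not the server's C++ code. The function names, the sample replies, and the tie-break on shard id are assumptions made for illustration; the only detail taken from the real command is that a full listDatabases reply carries a top-level totalSize, which a nameOnly:true reply does not include.

```python
from typing import Mapping


def total_shard_size(list_databases_reply: Mapping) -> int:
    """Return the total data size (bytes) reported by one shard's reply."""
    # A full listDatabases reply has a top-level totalSize field.
    # A nameOnly:true reply lists names only, so there is nothing to compare.
    if "totalSize" not in list_databases_reply:
        raise ValueError("reply has no totalSize (was nameOnly:true used?)")
    return int(list_databases_reply["totalSize"])


def select_shard_for_new_database(replies: Mapping[str, Mapping]) -> str:
    """Pick the shard whose reply reports the least total data."""
    if not replies:
        raise ValueError("no shards available")
    sizes = {shard: total_shard_size(reply) for shard, reply in replies.items()}
    # Tie-break on shard id so the result is deterministic (an assumption,
    # not necessarily what the server does).
    return min(sizes, key=lambda shard: (sizes[shard], shard))


# Hypothetical replies, one per shard, trimmed to the fields used here.
sample_replies = {
    "shard0": {"databases": [{"name": "a", "sizeOnDisk": 500}], "totalSize": 500},
    "shard1": {"databases": [{"name": "b", "sizeOnDisk": 120}], "totalSize": 120},
    "shard2": {"databases": [{"name": "c", "sizeOnDisk": 120}], "totalSize": 120},
}

chosen = select_shard_for_new_database(sample_replies)
```

The sketch makes the cost visible: every shard has to produce a sized reply before the minimum can be taken, so the work grows with the number of databases each shard holds.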