[SERVER-31020] Sharding database creation is slow because of shard disk-usage statistics gathering Created: 11/Sep/17  Updated: 12/Dec/23

Status: Open
Project: Core Server
Component/s: Internal Code, Sharding
Affects Version/s: 3.4.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Anthony Brodard Assignee: Backlog - Catalog and Routing
Resolution: Unresolved Votes: 0
Labels: 12/12, ShardingRoughEdges, high-value
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-31083 Allow passing primary shard to "enabl... Closed
is related to SERVER-35431 rollback does not correct sizeStorer ... Backlog
Assigned Teams:
Catalog and Routing
Operating System: ALL
Steps To Reproduce:

use anthony110920175
db.test.insert({test:true})

Participants:
Case:
Story Points: 3

 Description   

Hi team,

We are using a 10-shard (Primary / Secondary / Arbiter) cluster to host around 20k databases.
We noticed that creating a new database takes a long time, around 10 s:

2017-09-11T09:48:52.763+0000 I SHARDING [conn641] distributed lock 'anthony110920175' acquired for 'createDatabase', ts : 59b65c04bb2f983e7e106b2c
2017-09-11T09:49:02.438+0000 I SHARDING [conn641] Placing [anthony110920175] on: clust-1-sh1
2017-09-11T09:49:02.448+0000 I SHARDING [conn641] distributed lock with ts: 59b65c04bb2f983e7e106b2c' unlocked.

It looks like a

db.runCommand({listDatabases:1})

is launched on all shards, one by one, before the database is created on the config servers. This command takes around 1 s per shard.
Adding the "nameOnly:true" option makes the query almost instantaneous. Maybe it should be added when mongos checks whether a database already exists?
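
A quick sanity check of these figures, as a sketch in plain JavaScript: it computes the gap between the lock acquisition and the "Placing" log line and divides it by the 10 shards, assuming the whole interval is spent on the sequential per-shard calls. The helper name parseLogTimestamp is made up for this illustration.

```javascript
// Illustrative only: timestamps are copied from the two log lines in this
// report; the shard count (10) comes from the cluster description.
function parseLogTimestamp(ts) {
  // Turn a trailing "+0000" into "+00:00" so the string is strict ISO 8601.
  return Date.parse(ts.replace(/([+-]\d{2})(\d{2})$/, "$1:$2"));
}

const lockAcquired = parseLogTimestamp("2017-09-11T09:48:52.763+0000");
const shardChosen = parseLogTimestamp("2017-09-11T09:49:02.438+0000");

// Time between taking the distributed lock and placing the database.
const elapsedMs = shardChosen - lockAcquired;

// Assumption: the interval is dominated by one listDatabases call per shard.
const shardCount = 10;
const perShardMs = elapsedMs / shardCount;

console.log(`elapsed: ${elapsedMs} ms, per shard: ${perShardMs} ms`);
```

Under that assumption the interval works out to a little under 1 s per shard, which is in line with the per-shard timing mentioned above.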

Regards,
Anthony



 Comments   
Comment by John Moser [ 04/Nov/22 ]

Dear all

Do you have any update on this?

We just experienced a massive slowdown when creating a new db (collection) with >10k databases/collections in place.

Thanks

John

Comment by Kaloian Manassiev [ 22/Sep/17 ]

Hi anthony@sendinblue.com,

Optimizing the shard utilization statistics gathering is a sizable task, which first needs to be designed and then scheduled for implementation. Right now it is on the 'Backlog', which means it will at some point be weighed and prioritized against other server tasks. Because of this, we cannot promise a particular release in which it will be available.

Sorry we couldn't give you a more definite timeline. Please keep monitoring this ticket for further updates.

Best regards,
-Kal.

Comment by Anthony Brodard [ 21/Sep/17 ]

Thanks for this explanation, Dmitry!

Do you have an ETA for the optimization itself? Will it be included in a future 3.4 release, or only in 3.6?

Regards,
Anthony

Comment by Dmitry Agranat [ 13/Sep/17 ]

Hi Anthony,

Thank you for reporting this behavior.

Under the current design, we send listDatabases when creating a new database in order to check which shard currently has the least data on it. This is why we cannot use nameOnly:true instead: we need to get each shard's size.

Here’s the code for the mongoS create command, which is used to create a new collection or view (we don’t have an explicit createDatabase command).

This calls createShardDatabase.

If the database doesn’t exist in the config metadata, then we call ShardingCatalogClient::createDatabase. This takes a distributed lock, validates the new DB name, and then calls _selectShardForNewDatabase to pick a primary shard for the database.

_selectShardForNewDatabase in turn calls shardutil::retrieveTotalShardSize in order to determine the total data on each shard, so that it can create the new database on the smallest shard.
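
The selection step described above can be sketched roughly as follows. This is a simplified model in plain JavaScript, not the server's C++ code: the function body, the reply shape ({ shardId, totalSize }) and the shard names and sizes are all made up for illustration. It also shows why a reply without a size, as would be expected with nameOnly:true, is not enough to rank shards.

```javascript
// Hypothetical model of picking a primary shard for a new database.
// Only the idea (choose the shard holding the least data) comes from the
// explanation above; names and data below are illustrative assumptions.
function selectShardForNewDatabase(shardReplies) {
  if (shardReplies.length === 0) {
    throw new Error("no shards to choose from");
  }
  let smallest = null;
  for (const reply of shardReplies) {
    // Without a size the shard cannot be compared against the others.
    if (typeof reply.totalSize !== "number") {
      throw new Error(`no totalSize for shard ${reply.shardId}`);
    }
    if (smallest === null || reply.totalSize < smallest.totalSize) {
      smallest = reply;
    }
  }
  return smallest.shardId;
}

// Made-up shard names and sizes (bytes), standing in for one size-bearing
// listDatabases reply per shard.
const replies = [
  { shardId: "shardA", totalSize: 500 },
  { shardId: "shardB", totalSize: 120 },
  { shardId: "shardC", totalSize: 300 },
];

const chosen = selectShardForNewDatabase(replies);
console.log(chosen); // prints "shardB"
```

In this model the cost is one size-bearing reply per shard, so the total time grows with both the number of shards and the time each shard needs to compute its size.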

We have some ideas about optimizing this. We will post them in this ticket once they are finalized.

Thanks,
Dima

Generated at Thu Feb 08 04:25:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.