[DOCS-8512] Primary Shard of new Database is Shard with least data on it Created: 05/Aug/16  Updated: 30/Oct/23  Resolved: 29/Aug/16

Status: Closed
Project: Documentation
Component/s: manual
Affects Version/s: None
Fix Version/s: Server_Docs_20231030

Type: Task
Priority: Major - P3
Reporter: Roy Rim
Assignee: Ravind Kumar (Inactive)
Resolution: Done
Votes: 0
Labels: sharding
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Days since reply: 7 years, 23 weeks, 6 days ago
Story Points: 1

 Description   

This question comes up once in a while: when a new database is created in a sharded cluster, which shard is chosen as the "primary shard"? I have seen answers from support saying "round robin" and "arbitrary". According to the SERVER team, the shard with the least amount of data on it is picked as the "primary shard" for the new database. This has been the case at least since 2.0; I am unsure about earlier versions.

Our education classes disagree: m102 says it's always the first shard, while m202 says it's random.

The documentation at docs.mongodb.com doesn't mention this at all.



 Comments   
Comment by Ravind Kumar (Inactive) [ 01/Sep/16 ]

https://github.com/mongodb/docs/pull/2719

Comment by Githook User [ 01/Sep/16 ]

Author: ravind (rkumar-mongo) <ravind.kumar@10gen.com>

Message: DOCS-8512: Primary Shard Selection Process

Signed-off-by: kay <kay.kim@10gen.com>
Branch: master
https://github.com/mongodb/docs/commit/ecb772a6a827ad24182e1b362a075f396a8ce5e3

Comment by Ravind Kumar (Inactive) [ 29/Aug/16 ]

https://github.com/mongodb/docs/pull/2719

Comment by Roy Rim [ 05/Aug/16 ]

Thanks renctan

Comment by Randolph Tan [ 05/Aug/16 ]

roy.rim No. !isOK in this context would be something like a network error contacting the shard, an invalid shard name, a corrupted response format, etc. If there are no errors, candidateSizeStatus only contains the raw value for the data size in the shard; it knows nothing about how full the shard is. Come to think of it, this code should probably also take into account the shardSize setting...

Comment by Roy Rim [ 05/Aug/16 ]

renctan I'm not sure what candidateSizeStatus stands for, but I assume it means that there is space available. So regardless of how much space is available on the other shards, if the first one has no space you are not allowed to create a new database?

If so, I guess it makes sense: if any of your shards are full, you probably have a problem.

Comment by Randolph Tan [ 05/Aug/16 ]

roy.rim Yes. And that applies to the other shards as well.

Comment by Roy Rim [ 05/Aug/16 ]

renctan the first few lines in https://github.com/mongodb/mongo/blob/r3.3.10/src/mongo/s/catalog/replset/sharding_catalog_client_impl.cpp#L286-L303:

auto candidateSizeStatus = shardutil::retrieveTotalShardSize(txn, candidateShardId);
if (!candidateSizeStatus.isOK()) {
    return candidateSizeStatus.getStatus();
}

So if candidateSizeStatus is not OK for the first shard then this will error out? Just trying to understand.

Roy

Comment by Randolph Tan [ 05/Aug/16 ]

ravind.kumar Mongos selects the primary shard for a new database by picking the first shard that has the least amount of data. This logic has not changed since v1.6. The only change was how we determine the data size in the shard. Before v3.0, we were using mmapSize. With the introduction of WiredTiger (WT), mmapSize doesn't make sense anymore, so we changed it to use the totalSize field from the listDatabases command response instead.

For reference:
v1.6: https://github.com/mongodb/mongo/blob/v1.6/s/shard.cpp#L232-L238
v3.3: https://github.com/mongodb/mongo/blob/r3.3.10/src/mongo/s/catalog/replset/sharding_catalog_client_impl.cpp#L286-L303
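
To illustrate the selection rule described above, here is a minimal sketch. It is not the server code linked above. SizeResult, PickResult, SizeLookup and pickPrimaryShard are hypothetical names invented for this example, and the lookup callback merely stands in for whatever reports a shard's data size (totalSize in v3.0+). The behavior for an empty shard list is an assumption. The sketch shows two things from this thread: ties go to the first shard with the smallest size, and a failed size lookup on any candidate aborts the whole selection.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a status-or-value result: a size, or an error.
struct SizeResult {
    bool ok;
    std::int64_t bytes;   // meaningful only when ok is true
    std::string error;    // meaningful only when ok is false
};

// Hypothetical result of the selection: a shard id, or an error.
struct PickResult {
    bool ok;
    std::string shardId;
    std::string error;
};

// Hypothetical callback that reports the data size of one shard.
using SizeLookup = std::function<SizeResult(const std::string&)>;

// Returns the first shard with the smallest reported data size.
// Any failed lookup is returned immediately as an error.
PickResult pickPrimaryShard(const std::vector<std::string>& shardIds,
                            const SizeLookup& sizeOf) {
    if (shardIds.empty()) {
        // Assumption for this sketch: no shards means no primary shard.
        return {false, "", "no shards available"};
    }

    std::string bestId;
    std::int64_t bestSize = 0;
    bool haveBest = false;

    for (const auto& id : shardIds) {
        SizeResult r = sizeOf(id);
        if (!r.ok) {
            // Mirrors the early return on !isOK() in the quoted snippet.
            return {false, "", r.error};
        }
        // Strict '<' keeps the earliest shard when sizes are equal.
        if (!haveBest || r.bytes < bestSize) {
            bestId = id;
            bestSize = r.bytes;
            haveBest = true;
        }
    }

    assert(haveBest);  // the list was non-empty and every lookup succeeded
    return {true, bestId, ""};
}
```

For example, with shards shardA, shardB and shardC reporting 500, 100 and 100 bytes, the sketch picks shardB, the first of the two smallest. If the lookup for shardC failed instead, the call would return that error rather than fall back to another shard.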

When is the primary shard chosen?

Every time mongos creates a new database entry in config.databases.

Is there a single primary shard, or is it per database?

Per database.

How is the primary shard chosen?

See above.
