[SERVER-37535] createCollection command needs to add catalog metadata in sharded cluster Created: 09/Oct/18  Updated: 06/Dec/22  Resolved: 02/Nov/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Misha Tyulenev Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Won't Do Votes: 0
Labels: MaxH, pm-1051-legacy-tickets
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-37905 dropCollection command clears metadat... Closed
Assigned Teams:
Sharding
Participants:

 Description   

The create command in a sharded cluster first arrives at mongos, which sends it to the config server primary.
The config server, in turn, acquires dist locks, creates the catalog metadata, and sends the create command to the primary shard for the database.
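
The request flow above can be sketched with a small in-memory model. This is an illustration only, not the server's C++ implementation: the class names, the `dist_lock` helper, the choice of locking both the database and the collection namespace, and the shape of the metadata document are all assumptions made for the example.

```python
from contextlib import contextmanager


class Shard:
    """Toy stand-in for a shard primary that stores collections locally."""

    def __init__(self, name):
        self.name = name
        self.collections = set()

    def create(self, ns):
        self.collections.add(ns)
        return {"ok": 1}


class ConfigServer:
    """Toy stand-in for the config server primary (hypothetical model)."""

    def __init__(self, shards, primary_shard_by_db):
        self.shards = shards                            # shard name -> Shard
        self.primary_shard_by_db = primary_shard_by_db  # db name -> shard name
        self.held_locks = set()
        self.catalog = {}                               # ns -> metadata document
        self.log = []                                   # ordered trace of steps

    @contextmanager
    def dist_lock(self, resource):
        # Assumption: a dist lock is modelled as an exclusive named lock.
        if resource in self.held_locks:
            raise RuntimeError(f"lock busy: {resource}")
        self.held_locks.add(resource)
        self.log.append(f"lock {resource}")
        try:
            yield
        finally:
            self.held_locks.discard(resource)
            self.log.append(f"unlock {resource}")

    def create(self, db, coll):
        ns = f"{db}.{coll}"
        # Assumption: one lock on the database and one on the namespace.
        with self.dist_lock(db), self.dist_lock(ns):
            self.catalog[ns] = {"_id": ns}  # placeholder metadata document
            self.log.append(f"metadata {ns}")
            shard = self.shards[self.primary_shard_by_db[db]]
            self.log.append(f"forward {ns} -> {shard.name}")
            return shard.create(ns)


class Mongos:
    """Toy router: forwards create to the config server primary."""

    def __init__(self, config_server):
        self.config_server = config_server

    def create(self, db, coll):
        return self.config_server.create(db, coll)
```

In this model, `Mongos(cfg).create("test", "foo")` records the lock, metadata, and forward steps in `cfg.log` in that order, and releases both locks before returning.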

The createCollection command in a sharded cluster will add catalog metadata sufficient to route commands to the node that holds the collection's data. As a result, the difference between sharded and unsharded collections comes down to the presence of a shard key. A shard key imposes several restrictions on the data in the collection and on the access patterns; in exchange, it allows the data to be sharded, i.e. split into multiple chunks. An unsharded collection has no shard key and hence cannot be sharded. To reuse the catalog machinery built for sharded collections, the shard key piece will be ignored but the collection itself will be marked as sharded:false per SERVER-37572.
The net effect of the new createCollection is similar to calling the following in MongoDB 4.0:
createCollection
enableSharding
shardCollection
with the exception that the collection is still not sharded i.e. its only chunk is not splittable.
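
For comparison, the MongoDB 4.0 sequence listed above can be written out as command documents. This is a sketch: the helper name `legacy_equivalent_commands` is hypothetical, and the `{"_id": 1}` shard key is an assumption taken from the "_id" key pattern mentioned under Implementation. The function only builds the documents and does not contact a cluster.

```python
def legacy_equivalent_commands(db, coll):
    """Return (target database, command document) pairs, in the ticket's order.

    Hypothetical helper for illustration; nothing is sent to a server.
    """
    ns = f"{db}.{coll}"
    return [
        # createCollection: the underlying command is `create`, run on the user database.
        (db, {"create": coll}),
        # enableSharding and shardCollection are run against the admin database.
        ("admin", {"enableSharding": db}),
        # Assumption: shard key on _id, matching the key pattern in this ticket.
        ("admin", {"shardCollection": ns, "key": {"_id": 1}}),
    ]
```

Unlike this sequence, the proposed createCollection leaves the collection unsharded, so no split of its single chunk would follow.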

Implementation

The method that creates a collection in a sharded cluster is ShardingCatalogManager::createCollection. It sends the create command to the primary shard to add the collection to storage. In addition, this method needs to:

  1. add a config.chunks entry
    Besides the usual defaults for a new chunk, it needs to set a key pattern for the shard key. This pattern will be "_id".
  2. add a config.collections entry
    As the config.databases entry has already been added, the remaining step is adding config.collections metadata with _sharded set to false to indicate that this metadata represents an unsharded collection.
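
The two metadata entries above might look roughly like the documents built below. This is a simplified sketch, not the actual catalog schema: the function name and the reduced field set (`ns`, `min`, `max`, `shard`, `key`) are assumptions, and `MIN_KEY`/`MAX_KEY` are string placeholders standing in for the BSON MinKey and MaxKey values. Only the "_id" key pattern and the `_sharded` flag come from this ticket.

```python
# Placeholders for BSON MinKey / MaxKey (assumption: real code would use BSON types).
MIN_KEY = "MinKey"
MAX_KEY = "MaxKey"


def build_unsharded_metadata(ns, primary_shard):
    """Build simplified config.chunks and config.collections documents.

    Hypothetical helper; field names other than `_sharded` are illustrative.
    """
    key_pattern = {"_id": 1}  # key pattern on "_id", per the ticket
    chunk = {
        "ns": ns,
        # A single chunk whose bounds are expressed on the _id key.
        "min": {"_id": MIN_KEY},
        "max": {"_id": MAX_KEY},
        "shard": primary_shard,
    }
    collection = {
        "_id": ns,
        "key": key_pattern,
        "_sharded": False,  # marks the metadata as describing an unsharded collection
    }
    return chunk, collection
```

Calling `build_unsharded_metadata("test.foo", "shard0")` yields one chunk document owned by `shard0` and one collection document with `_sharded` set to `False`.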

Test

  1. Test that createCollection adds an entry to config.collections and that this entry has _sharded set to false
  2. Test that createCollection adds a document to config.chunks
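
The two checks above could be prototyped against an in-memory fake before writing real integration tests. Everything in this sketch is hypothetical: `FakeConfigDB` and the toy `create_collection` stand in for the config database and for ShardingCatalogManager::createCollection, and the document fields are simplified.

```python
class FakeConfigDB:
    """In-memory stand-in for the config database (hypothetical)."""

    def __init__(self):
        self.collections = {}  # ns -> config.collections document
        self.chunks = []       # config.chunks documents


def create_collection(config, ns, primary_shard):
    """Toy createCollection that only records the catalog metadata."""
    config.chunks.append({
        "ns": ns,
        "min": {"_id": "MinKey"},  # string placeholder for BSON MinKey
        "max": {"_id": "MaxKey"},  # string placeholder for BSON MaxKey
        "shard": primary_shard,
    })
    config.collections[ns] = {"_id": ns, "_sharded": False}


def test_collections_entry_is_marked_unsharded():
    config = FakeConfigDB()
    create_collection(config, "test.foo", "shard0")
    assert "test.foo" in config.collections
    assert config.collections["test.foo"]["_sharded"] is False


def test_chunk_document_is_added():
    config = FakeConfigDB()
    create_collection(config, "test.foo", "shard0")
    chunks = [c for c in config.chunks if c["ns"] == "test.foo"]
    assert len(chunks) == 1
```

The test functions follow the pytest naming convention, so a runner such as pytest would pick them up, though they can also be called directly.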


 Comments   
Comment by Max Hirschhorn [ 02/Nov/22 ]

There will be a separate project for having all collections be sharded upon creation with a single chunk range {_id: MinKey} to {_id: MaxKey}.
