[SERVER-32619] Change shardCollection to allow replacing shard keys of single chunk collections Created: 09/Jan/18  Updated: 06/Dec/22  Resolved: 02/Nov/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Randolph Tan
Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Won't Do
Votes: 0
Labels: MaxH, pm-1051-legacy-tickets
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Sharding
Participants:

 Description   

Because unsharded collections are always assigned _id as their shard key, shardCollection will effectively "reshard" the collection with the new shard key, but only if the collection has a single chunk. It will also set “allowSplit: true” for the collection.

As in 3.6, it will still be possible to call shardCollection on a nonexistent collection, solely to continue supporting mapReduce with output to a new sharded collection. In this case, however, shardCollection will create the collection with the same visibility rules as if the user had called createCollection explicitly. That is, it is legal for the collection to be dropped between the execution of the createCollection logic and the execution of the shardCollection logic.

_configsvrShardCollection logic

  1. ScopedDistLock dbLock(dbName)
  2. ScopedDistLock collLock(collName)
  3. If config.collections does not have an entry for the collection, drop the distlocks and execute the createCollection logic once. Then re-obtain the distlocks and check again, and if the entry still doesn’t exist, return ConflictingOperationInProgress.
  4. If config.collections shows the collection is already sharded with the same key, return OK
  5. If config.collections has a different key, verify that the collection has only a single chunk; otherwise return an error.
  6. Ensure that the primary shard and the shard that owns the (single) chunk both have an index on the requested shard key (create the indexes if necessary)
  7. Replace the single chunk with a new chunk with the major version bumped while preserving the same epoch and UUID.
  8. Update the config.collections entry’s “key” field and set allowSplit to true
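
The steps above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the server implementation: config.collections is modeled as a plain dict, the distlocks of steps 1 and 2 are omitted, and every name here (CollectionEntry, Chunk, shard_collection, the create_collection and ensure_index callbacks) is hypothetical. Shard keys are modeled as tuples of (field, direction) pairs so that field order is significant.

```python
from dataclasses import dataclass, field


class ShardCollectionError(Exception):
    """Hypothetical error type standing in for a non-OK command status."""


@dataclass
class Chunk:
    shard: str   # shard that owns the chunk
    major: int   # major component of the chunk version
    epoch: str   # collection epoch the chunk belongs to


@dataclass
class CollectionEntry:
    key: tuple                      # e.g. (("_id", 1),)
    epoch: str
    uuid: str
    allow_split: bool = False
    chunks: list = field(default_factory=list)


def shard_collection(config_collections, ns, requested_key, primary_shard,
                     create_collection, ensure_index):
    # Steps 1-2 (ScopedDistLock on db and collection) are omitted in this model.
    entry = config_collections.get(ns)
    if entry is None:
        # Step 3: run the createCollection logic once, then check again.
        create_collection(ns)
        entry = config_collections.get(ns)
        if entry is None:
            raise ShardCollectionError("ConflictingOperationInProgress")

    # Step 4: same key already recorded, nothing to do.
    if entry.key == requested_key:
        return "OK"

    # Step 5: a different key is only replaceable on a single-chunk collection.
    if len(entry.chunks) != 1:
        raise ShardCollectionError("collection has more than one chunk")
    old_chunk = entry.chunks[0]

    # Step 6: make sure both relevant shards have the shard key index.
    for shard in sorted({primary_shard, old_chunk.shard}):
        ensure_index(shard, ns, requested_key)

    # Step 7: replace the chunk, bumping the major version and keeping the
    # epoch (the UUID lives on the entry and is left untouched).
    entry.chunks = [Chunk(old_chunk.shard, old_chunk.major + 1, old_chunk.epoch)]

    # Step 8: record the new key and allow splits.
    entry.key = requested_key
    entry.allow_split = True
    return "OK"
```

In this model a caller would pass the current _id-keyed entry and a new key such as (("x", 1),); the entry keeps its epoch and UUID, and only the key, the allowSplit flag and the chunk's major version change.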

Whenever a node does an incremental refresh, it needs to check whether the shardKey field changed and make sure the new ChunkManager/CollectionMetadata is assigned the new shard key (today, the old shard key is blindly copied, because shard keys are never expected to change).
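
A minimal sketch of the refresh-side change, assuming a simplified cache model: RoutingInfo and incremental_refresh are hypothetical names standing in for the ChunkManager/CollectionMetadata rebuild, and chunks are reduced to a mapping from range to major version. The point it illustrates is that the new object takes its shard key from the freshly read config.collections entry rather than copying it from the cached object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingInfo:
    shard_key: tuple       # e.g. (("_id", 1),)
    epoch: str
    chunk_versions: dict   # (min, max) range -> major version


def incremental_refresh(cached, entry_key, entry_epoch, changed_chunks):
    # An epoch change means the collection was dropped and recreated;
    # that case is outside this sketch.
    if entry_epoch != cached.epoch:
        raise ValueError("epoch changed; a full refresh is required")

    # Apply only the chunks that changed since the cached version.
    merged = dict(cached.chunk_versions)
    merged.update(changed_chunks)

    # Use the key from the config.collections entry, not cached.shard_key,
    # so a replaced shard key is picked up without a full refresh.
    return RoutingInfo(shard_key=entry_key, epoch=cached.epoch,
                       chunk_versions=merged)
```

Because the epoch is preserved when the key is replaced, the epoch alone would not tell a node that anything other than chunk versions changed, which is why the key has to be compared or re-read explicitly.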



 Comments   
Comment by Max Hirschhorn [ 02/Nov/22 ]

There will be a separate project for having all collections be sharded upon creation with a single chunk range {_id: MinKey} to {_id: MaxKey}.

Generated at Thu Feb 08 04:30:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.