[SERVER-86332] Reformat shardCollection to make Unique and Collations Clearer Created: 06/Feb/24 Updated: 07/Feb/24 |
|
| Status: | Needs Scheduling |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Matt Panton | Assignee: | Backlog - Catalog and Routing |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Catalog and Routing
|
| Participants: |
| Description |
|
The shardCollection command takes two options, unique and collation, that do not refer to or modify the collection, but are rather options for modifying the shard key index. For example, if I create a collection:
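A minimal sketch of the kind of create command described here, since the original snippet is not present in this export. The collection name and the nl locale come from the text of this ticket; everything else is an assumption. The command is built as a plain document so the shape is easy to inspect.

```javascript
// Command document for the `create` command with a collection-level default collation.
// The collection name and locale are taken from the ticket text.
const createCmd = {
  create: "nonDefaultCollationCollection",
  collation: { locale: "nl" }, // Dutch
};

// In mongosh, against a live deployment, this would be run as:
//   db.runCommand(createCmd)
// or, with the shell helper:
//   db.createCollection("nonDefaultCollationCollection", { collation: { locale: "nl" } })
console.log(JSON.stringify(createCmd));
```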
I've set the default collation for the collection to nl (Dutch), and the default _id index has inherited that collation. db.nonDefaultCollationCollection.getIndexes() returns:
To shard the collection on a non-_id field, the server requires me to pass a collation document with the simple locale to shardCollection.
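A sketch of what such a shardCollection invocation could look like. The field randomField, the unique option, and the simple collation are from this ticket; the database name test and the ascending key direction are assumptions made only for illustration.

```javascript
// Command document for shardCollection on a non-_id field.
// "test" as the database name is an assumption; the ticket does not name the database.
const shardCmd = {
  shardCollection: "test.nonDefaultCollationCollection",
  key: { randomField: 1 },         // ascending range shard key (assumed direction)
  unique: true,                    // applies to the shard key index
  collation: { locale: "simple" }, // required because the collection default is nl
};

// In mongosh, connected to a mongos, this would be run as:
//   db.adminCommand(shardCmd)
// or, with the shell helper:
//   sh.shardCollection("test.nonDefaultCollationCollection", { randomField: 1 }, true,
//                      { collation: { locale: "simple" } })
console.log(JSON.stringify(shardCmd));
```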
Running db.nonDefaultCollationCollection.getIndexes() now shows that the index on randomField has the simple collation (its index specification contains no collation document) and is enforcing uniqueness.
To enhance consistency with the create command and with what actually happens within the server when executing shardCollection, the unique and collation options should be moved to a field that encapsulates both options, much like the timeseries field currently groups multiple options. |
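To make the proposal concrete, here is a rough sketch comparing the current command shape with one possible nested shape. The field name shardKeyIndexOptions is purely hypothetical, as is the helper nestIndexOptions; the ticket does not propose a specific name, and no such field exists in the server today. The database name test is also an assumption.

```javascript
// Current shape: unique and collation sit at the top level of the command.
const currentShape = {
  shardCollection: "test.nonDefaultCollationCollection", // "test" is an assumed database name
  key: { randomField: 1 },
  unique: true,
  collation: { locale: "simple" },
};

// Hypothetical helper: move the two index-related options under a single field,
// similar in spirit to how `create` groups its time-series options under `timeseries`.
function nestIndexOptions(cmd, fieldName = "shardKeyIndexOptions") {
  const { unique, collation, ...rest } = cmd;
  return { ...rest, [fieldName]: { unique, collation } };
}

const proposedShape = nestIndexOptions(currentShape);
console.log(JSON.stringify(proposedShape, null, 2));
```

The nested form makes it visible at the call site that both options describe the shard key index rather than the collection.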
| Comments |
| Comment by Max Hirschhorn [ 07/Feb/24 ] |
|
The shardCollection command does more than create an index to support the shard key pattern if such an index doesn't already exist and the collection is empty. It also records the partitioning scheme on the config server. How data is partitioned must be aware of the collation, because routing decisions involve comparing values that may be or contain strings, and those values must be compared with the collator for correctness. The requirement to run the shardCollection command with {collation: {locale: "simple"}} comes from the convention that commands which accept a collation (e.g. find, update, delete) use the collection's default collation when the parameter is omitted. However, because PM-1930 is not complete, the simple collation is the only option for partitioning a collection. Can more be said here on what is inconsistent with the create command? |
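The point about routing depending on the collation can be illustrated with a toy model. This is not the server's routing code: the two chunks, the split point "a", the shard names, and the route function are all hypothetical, and Intl.Collator merely stands in for a non-simple collation such as nl.

```javascript
// Toy model: two chunks split at "a".
//   [MinKey, "a") lives on shardA, ["a", MaxKey) lives on shardB.
const splitPoint = "a";

// Simple collation: binary comparison. For ASCII strings, JavaScript's
// code-unit ordering gives the same result.
const simpleCompare = (x, y) => (x < y ? -1 : x > y ? 1 : 0);

// Locale-aware comparison, standing in for a non-simple collation.
const dutchCollator = new Intl.Collator("nl");
const dutchCompare = (x, y) => dutchCollator.compare(x, y);

// Pick the chunk owner for a shard key value under a given comparison function.
function route(value, compare) {
  return compare(value, splitPoint) < 0 ? "shardA" : "shardB";
}

const routes = {
  simple: route("Z", simpleCompare), // "Z" (0x5A) sorts before "a" (0x61) in binary order
  dutch: route("Z", dutchCompare),   // "Z" sorts after "a" in locale order
};
console.log(routes);
```

The same value lands on different shards depending on which comparison is used, which is why the partitioning scheme and the router have to agree on a single collation.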