[SERVER-85346] Sharded collections permit unique indexes with non-"simple" collations, leading to uniqueness violations
Created: 17/Jan/24  Updated: 18/Jan/24

Status: Backlog
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Max Hirschhorn
Assignee: Backlog - Catalog and Routing
Resolution: Unresolved
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams: Catalog and Routing
Operating System: ALL
Participants:
Story Points: 3

Description

This is a beginning-of-time bug: it has existed since collation was first introduced in MongoDB 3.4.

A unique index {key: {a: 1}, unique: true, collation: {locale: "en_US", strength: 2}} compares strings case-insensitively, so it does not permit both of the documents {a: "YES"} and {a: "yes"} to be inserted. Yet sharded collections are always partitioned according to the "simple" collation, independent of the collection's default collation. This means a partitioning scheme of {key: {a: 1}, collation: {locale: "simple"}} treats these two shard key values as distinct and may place them on separate shards. In that scenario, the unique index {key: {a: 1}, unique: true, collation: {locale: "en_US", strength: 2}} on each of the two shards contains only one of the two shard key values, so neither shard detects a conflict and both documents can exist simultaneously in the sharded cluster. Therefore global uniqueness enforcement cannot be inferred from local uniqueness enforcement.
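The scenario above can be reproduced with a small in-memory model. This is only a sketch, not MongoDB code: the Shard class, the route() helper, and the split point "a" are hypothetical, str.casefold() is used as a rough stand-in for the {locale: "en_US", strength: 2} collation, and Python's code point string ordering stands in for the "simple" collation.

```python
class DuplicateKeyError(Exception):
    """Raised when a shard-local unique index rejects a key."""


class Shard:
    """Hypothetical shard that enforces a case-insensitive unique index on 'a'."""

    def __init__(self, name):
        self.name = name
        self.docs = []
        self.unique_keys = set()

    def insert(self, doc):
        # Assumption: casefold() approximates a strength-2 (case-insensitive)
        # collation well enough for plain ASCII strings.
        key = doc["a"].casefold()
        if key in self.unique_keys:
            raise DuplicateKeyError(f"{self.name}: duplicate key {doc['a']!r}")
        self.unique_keys.add(key)
        self.docs.append(doc)


def route(value, shards, split_point="a"):
    """Pick a shard by comparing the shard key under the 'simple' collation.

    Python compares str values by code point, which matches a binary
    comparison for ASCII input. The split point "a" is an arbitrary choice.
    """
    return shards[0] if value < split_point else shards[1]


shards = [Shard("shard0"), Shard("shard1")]
for doc in ({"a": "YES"}, {"a": "yes"}):
    # "YES" sorts before "a" and "yes" sorts after it, so the two documents
    # land on different shards and neither local index sees a conflict.
    route(doc["a"], shards).insert(doc)

for shard in shards:
    print(shard.name, shard.docs)
```

Both inserts succeed because each shard checks uniqueness only against its own documents. Inserting the same two documents into a single Shard instance raises DuplicateKeyError, which is the behavior the unique index is supposed to guarantee cluster-wide.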

The core problem is that neither createIndexes nor shardCollection accounts for the collation of unique indexes. In particular, ShardKeyPattern::isIndexUniquenessCompatible() accepts only the index key pattern as an input, so it cannot see the index's collation.
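To illustrate why a key-pattern-only check is insufficient, the sketch below contrasts it with a collation-aware variant. Both functions are hypothetical and heavily simplified: key_pattern_only_check() is a stand-in for the idea behind ShardKeyPattern::isIndexUniquenessCompatible(), not its actual logic, and collation_aware_check() is one possible shape for a stricter check, not a fix proposed in this ticket. It also assumes an index spec without a collation field uses the "simple" collation, whereas a real index would inherit the collection's default collation.

```python
def key_pattern_only_check(shard_key, index_key):
    """Hypothetical stand-in: accept when the shard key is a prefix of the index key.

    Only key patterns are compared, so the index's collation is invisible here.
    """
    shard_fields = list(shard_key.items())
    index_fields = list(index_key.items())
    return index_fields[: len(shard_fields)] == shard_fields


def collation_aware_check(shard_key, index_spec):
    """Hypothetical stricter check that also requires the 'simple' collation.

    Assumption: a missing 'collation' field means 'simple'; the collection's
    default collation is not modeled.
    """
    collation = index_spec.get("collation", {"locale": "simple"})
    return (
        key_pattern_only_check(shard_key, index_spec["key"])
        and collation.get("locale") == "simple"
    )


shard_key = {"a": 1}
index_spec = {
    "key": {"a": 1},
    "unique": True,
    "collation": {"locale": "en_US", "strength": 2},
}

# The key-pattern-only check accepts the index from the description,
# while the collation-aware variant rejects it.
print(key_pattern_only_check(shard_key, index_spec["key"]))
print(collation_aware_check(shard_key, index_spec))
```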

