[SERVER-36429] $toUpper / $toLower should support locale-aware case mappings
Created: 03/Aug/18
Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: New Feature
Priority: Minor - P4
Reporter: Andrew Shevchuk
Assignee: Backlog - Query Optimization
Resolution: Unresolved
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Optimization
Participants:

 Description   

$toUpper and $toLower currently have defined behavior only for ASCII characters. Users should be able to perform case mapping and case folding for all Unicode code points, according to locale-specific rules. ICU appears to support this: http://userguide.icu-project.org/transforms/casemappings
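
To make the gap concrete, here is a minimal Python sketch (not MongoDB or ICU code) contrasting three behaviors: an ASCII-only mapping, Python's built-in locale-independent Unicode mapping, and a rough approximation of Turkish tailoring. The helper names `ascii_to_upper` and `turkish_to_upper` are hypothetical, and passing non-ASCII characters through unchanged is only an assumption about one possible ASCII-only behavior, since the server leaves that case undefined.

```python
# Hypothetical helpers for illustration only; not MongoDB or ICU APIs.

def ascii_to_upper(s: str) -> str:
    """Uppercase only a-z; pass every other code point through unchanged.

    Assumption: this is one possible ASCII-only behavior, not a statement
    of what the server does with non-ASCII input.
    """
    return "".join(
        chr(ord(ch) - 32) if "a" <= ch <= "z" else ch
        for ch in s
    )


def turkish_to_upper(s: str) -> str:
    """Rough approximation of Turkish tailoring for uppercasing.

    Maps dotted 'i' to U+0130 (capital I with dot above) first, then
    applies the default mapping, which sends U+0131 (dotless i) to 'I'.
    """
    return s.replace("i", "\u0130").upper()


word = "stra\u00dfe"  # German "strasse" spelled with U+00DF (sharp s)

ascii_result = ascii_to_upper(word)            # sharp s is left as-is
unicode_result = word.upper()                  # sharp s expands to "SS"
default_result = "istanbul".upper()            # locale-independent: "ISTANBUL"
turkish_result = turkish_to_upper("istanbul")  # starts with U+0130
```

The Turkish case is the reason the request asks for locale-specific rules rather than a single Unicode-wide mapping: the same input letter maps to different uppercase letters depending on the language.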



 Comments   
Comment by Andrew Shevchuk [ 03/Aug/18 ]

Yes, locale-aware case folding would be nice.
Thanks.

Comment by David Storch [ 03/Aug/18 ]

Hi ashevchuk,

Thanks for filing this report! A collation is nothing more than a comparator between two strings. Therefore, a collation does not specify case mapping rules. Our documentation specifies that $toUpper only has well-defined behavior for ASCII characters (and the same holds for $toLower). This is working as designed.
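
The distinction between a comparator and a case mapping can be sketched in Python as follows. This is an illustration only: `compare_ignoring_case` is a hypothetical stand-in that uses Python's `str.casefold`, whereas a real collation applies its own comparison rules.

```python
def compare_ignoring_case(a: str, b: str) -> int:
    """Hypothetical comparator: orders two strings, returning -1, 0, or 1.

    It answers "how do these two strings compare?" and never produces
    a transformed string.
    """
    ka, kb = a.casefold(), b.casefold()
    return (ka > kb) - (ka < kb)


def to_upper(s: str) -> str:
    """A case mapping: takes one string and returns a transformed string."""
    return s.upper()


same = compare_ignoring_case("Hello", "HELLO")       # 0: compare as equal
ordered = compare_ignoring_case("apple", "Banana")   # -1: "apple" sorts first
mapped = to_upper("Hello")                           # "HELLO"
```

A comparator can treat "Hello" and "HELLO" as equal without ever defining what the uppercase form of "Hello" is, which is why a collation setting alone cannot drive $toUpper or $toLower.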

I am going to convert this into a feature request to permit locale-aware case folding in the MongoDB aggregation framework.

Best,
Dave
