[SERVER-36429] $toUpper / $toLower should support locale-aware case mappings Created: 03/Aug/18 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Minor - P4 |
| Reporter: | Andrew Shevchuk | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Query Optimization |
| Participants: |
| Description |
|
$toUpper / $toLower currently have well-defined behavior only for ASCII characters. Users should be able to perform case folding / case mapping for all Unicode code points, according to locale-specific rules. It looks like ICU has support for this: http://userguide.icu-project.org/transforms/casemappings. |
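To make the gap concrete, here is a minimal Python sketch of three kinds of case mapping: ASCII-only, the default (locale-independent) Unicode mapping, and a locale-tailored mapping. It is an illustration only, not MongoDB's or ICU's implementation. The helper names `ascii_upper` and `locale_upper` and the small Turkic override table are hypothetical. Leaving non-ASCII characters untouched in `ascii_upper` is an assumption, since the documentation only says the behavior is not well defined for them.

```python
def ascii_upper(s: str) -> str:
    """Uppercase only ASCII a-z; every other code point is left unchanged.

    Assumption: this models an "ASCII-only" operator for illustration.
    """
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


# Hypothetical, deliberately tiny tailoring table. In Turkish and Azerbaijani,
# dotted "i" uppercases to "İ" (U+0130) and dotless "ı" (U+0131) to "I".
_TURKIC_UPPER = {"i": "\u0130", "\u0131": "I"}


def locale_upper(s: str, locale: str = "") -> str:
    """Apply a locale tailoring first, then the default Unicode mapping."""
    lang = locale.replace("-", "_").split("_")[0].lower()
    if lang in ("tr", "az"):
        s = "".join(_TURKIC_UPPER.get(c, c) for c in s)
    # str.upper() implements the default Unicode case mapping (no locale).
    return s.upper()


if __name__ == "__main__":
    print(ascii_upper("café"))             # CAFé  (é is outside ASCII)
    print("café".upper())                  # CAFÉ  (default Unicode mapping)
    print("straße".upper())                # STRASSE (one-to-many mapping)
    print(locale_upper("istanbul", "en"))  # ISTANBUL
    print(locale_upper("istanbul", "tr"))  # İSTANBUL
```

A real implementation would delegate to a library such as ICU rather than a hand-written table, because the full rules also cover context-sensitive and one-to-many mappings in other languages.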
| Comments |
| Comment by Andrew Shevchuk [ 03/Aug/18 ] |
|
Yes, locale-aware case folding would be nice. |
| Comment by David Storch [ 03/Aug/18 ] |
|
Hi ashevchuk, thanks for filing this report! A collation is nothing more than a comparator between two strings, so it does not specify case mapping rules. Our documentation specifies that $toUpper only has well-defined behavior for ASCII characters (and the same applies to $toLower). This is working as designed. I am going to convert this into a feature request to permit locale-aware case folding in the MongoDB aggregation framework. Best, |