[CDRIVER-2159] Ignore non-ASCII when downcasing domain names Created: 10/May/17 Updated: 28/Oct/23 Resolved: 13/Jun/18 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | libmongoc, network |
| Affects Version/s: | None |
| Fix Version/s: | 1.11.0 |
| Type: | Task | Priority: | Minor - P4 |
| Reporter: | A. Jesse Jiryu Davis | Assignee: | Unassigned |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | neweng | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Description |
|
SDAM requires us to downcase domain names of MongoDB servers: Our decision was to translate uppercase ASCII chars to lowercase and leave non-ASCII chars unchanged. The C Driver uses mongoc_lowercase for this. I'm concerned that a multibyte UTF-8 character could be corrupted by its algorithm:
It should probably use bson_utf8_next_char to advance through the string, instead of using "++". |
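A minimal sketch of the character-wise approach suggested above, not the driver's actual code. To stay standalone it does not call libbson: `utf8_seq_len` is a hypothetical stand-in for the stepping that `bson_utf8_next_char` would provide, and `lowercase_ascii_only` is an illustrative name, not `mongoc_lowercase`. It assumes `dst` has room for `strlen (src) + 1` bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte length of the UTF-8 sequence whose lead byte is c.
 * Returns 1 for ASCII and also for an invalid lead byte, so we always advance. */
size_t
utf8_seq_len (unsigned char c)
{
   if ((c & 0x80) == 0x00) return 1; /* 0xxxxxxx: ASCII */
   if ((c & 0xE0) == 0xC0) return 2; /* 110xxxxx */
   if ((c & 0xF0) == 0xE0) return 3; /* 1110xxxx */
   if ((c & 0xF8) == 0xF0) return 4; /* 11110xxx */
   return 1;
}

/* Copy src into dst, lowercasing only ASCII 'A'..'Z'. Multibyte sequences
 * are copied verbatim, one whole character at a time. */
void
lowercase_ascii_only (const char *src, char *dst)
{
   size_t n, i = 0;

   assert (src != NULL && dst != NULL);
   n = strlen (src);

   while (i < n) {
      size_t len = utf8_seq_len ((unsigned char) src[i]);

      if (len > n - i) {
         len = n - i; /* truncated sequence: never read past the end */
      }

      if (len == 1) {
         char c = src[i];
         dst[i] = (c >= 'A' && c <= 'Z') ? (char) (c - 'A' + 'a') : c;
      } else {
         memcpy (dst + i, src + i, len);
      }

      i += len;
   }

   dst[n] = '\0';
}
```

Because the ASCII test is an explicit range check rather than a call to `tolower`, the result does not depend on the process locale.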
| Comments |
| Comment by Githook User [ 13/Jun/18 ] |
|
Author: {'username': 'spencemc', 'name': 'Spencer McKenney', 'email': 'spencermck@me.com'} Message: |
| Comment by Kevin Albertson [ 13/Jun/18 ] |
|
spencer.mckenney@10gen.com, when you're ready, let's go through the changes for this and the workflow for working on a ticket. |
| Comment by A. Jesse Jiryu Davis [ 08/May/18 ] |
|
Good point. Still, this code seems to work by accident. Wouldn't it be better to always iterate UTF-8 characters using bson_utf8_next_char instead of ++? |
| Comment by Kevin Albertson [ 08/May/18 ] |
|
From the man page of tolower:
Multi-byte UTF-8 characters consist entirely of bytes with the leading bit set to 1. So I think those should get returned unaltered. |
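The claim above can be checked with a short sketch. It assumes the program runs in the default "C" locale, where `tolower` maps only `A` to `Z`; under another locale (a single-byte Latin-1 locale, for example) a byte at or above 0x80 could be remapped. The function name `high_bytes_survive_tolower` is illustrative only and is not part of the driver.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if tolower() leaves every byte of s that has its high bit set
 * unchanged, and 0 otherwise. The result depends on the current locale. */
int
high_bytes_survive_tolower (const char *s)
{
   size_t i, n;

   assert (s != NULL);
   n = strlen (s);

   for (i = 0; i < n; i++) {
      /* Cast to unsigned char first: passing a negative char value to
       * tolower() is undefined behavior. */
      unsigned char b = (unsigned char) s[i];

      if ((b & 0x80) && tolower (b) != b) {
         return 0;
      }
   }

   return 1;
}
```

Calling it on a hostname that contains U+00DC (bytes 0xC3 0x9C) should return 1 in the "C" locale, which matches the reasoning in this comment while also showing why the behavior hinges on the locale.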