[CSHARP-27] Text would be broken when receive string value with unicode chars Created: 11/Mar/10 Updated: 15/Mar/10 Resolved: 15/Mar/10 |
|
| Status: | Closed |
| Project: | C# Driver |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Jeffrey Zhao | Assignee: | Sam Corder |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
mono on Mac OS X (Snow Leopard), MS CLR (v2.0 & 4.0 RC) |
||
| Description |
|
Issue: Text is corrupted when a received string value contains non-ASCII Unicode characters. Reason: There is a bug in the BsonReader.GetString(int length) method. The method reads the string value through a 128-byte buffer and decodes each 128-byte chunk as UTF-8 separately, but a chunk may end in the middle of a multi-byte UTF-8 character. The two halves of that character are then decoded independently and neither is valid, so the text is corrupted at the chunk boundary. How to fix: Collect the complete data in a MemoryStream before converting it to a UTF-8 string. |
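The failure mode and the proposed fix can be illustrated with a minimal sketch. This is not the driver's actual code: `Utf8ChunkDemo`, `DecodePerChunk`, and `DecodeWhole` are hypothetical names, and the way the real `BsonReader` reads from its stream is assumed here. Only the 128-byte chunk size and the use of a `MemoryStream` come from the report.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8ChunkDemo
{
    // Hypothetical stand-in for the reported behaviour:
    // every fixed-size chunk is decoded on its own.
    public static string DecodePerChunk(byte[] data, int chunkSize)
    {
        StringBuilder sb = new StringBuilder();
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            // A multi-byte character cut by the chunk boundary cannot be decoded here.
            sb.Append(Encoding.UTF8.GetString(data, offset, count));
        }
        return sb.ToString();
    }

    // Sketch of the suggested fix: collect all bytes first, then decode once.
    public static string DecodeWhole(Stream source, int length, int chunkSize)
    {
        byte[] buffer = new byte[chunkSize];
        using (MemoryStream collected = new MemoryStream(length))
        {
            int remaining = length;
            while (remaining > 0)
            {
                int read = source.Read(buffer, 0, Math.Min(buffer.Length, remaining));
                if (read == 0) throw new EndOfStreamException();
                collected.Write(buffer, 0, read);
                remaining -= read;
            }
            return Encoding.UTF8.GetString(collected.ToArray());
        }
    }

    public static void Main()
    {
        // 127 ASCII bytes followed by one 3-byte character (U+4E2D),
        // so the character straddles the 128-byte boundary.
        string text = new string('a', 127) + "\u4e2d";
        byte[] data = Encoding.UTF8.GetBytes(text);

        Console.WriteLine(DecodePerChunk(data, 128) == text);                             // False
        Console.WriteLine(DecodeWhole(new MemoryStream(data), data.Length, 128) == text); // True
    }
}
```

An alternative that avoids buffering the whole value would be a stateful `System.Text.Decoder` obtained from `Encoding.UTF8.GetDecoder()`, which carries incomplete byte sequences over to the next call. The report itself only proposes the `MemoryStream` approach.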
| Comments |
| Comment by Sam Corder [ 15/Mar/10 ] |
|
Fixed for both fixed length and variable length strings now. |
| Comment by Sam Corder [ 13/Mar/10 ] |
|
Forgot to add the code to the part that reads a known string length. |
| Comment by Sam Corder [ 13/Mar/10 ] |
|
Fixed in http://github.com/samus/mongodb-csharp/commit/0e73a20d5cf20190dc210eb7f18089d60d610948. This will be in 0.82. Thanks for making me learn more about UTF-8 than I ever wanted to. |
| Comment by Steve Wagner [ 11/Mar/10 ] |
|
Can you provide an example Unicode string and paste it as base64? With that I could integrate it into the test suite. |
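A sample of the kind requested here could be produced roughly as follows. This is only a sketch: `Base64Fixture` and its methods are hypothetical helpers, not part of the driver or its test suite, and the sample character is an arbitrary choice.

```csharp
using System;
using System.Text;

public static class Base64Fixture
{
    // Encode a string's UTF-8 bytes as base64 so it can be pasted into a ticket or test.
    public static string ToBase64(string text)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(text));
    }

    // Recover the original string from its base64 form inside a test.
    public static string FromBase64(string base64)
    {
        return Encoding.UTF8.GetString(Convert.FromBase64String(base64));
    }

    public static void Main()
    {
        // U+4E2D is encoded in UTF-8 as the three bytes E4 B8 AD.
        Console.WriteLine(ToBase64("\u4e2d"));                 // 5Lit
        Console.WriteLine(FromBase64("5Lit") == "\u4e2d");     // True
    }
}
```

To reproduce the reported bug, the decoded string would need at least one multi-byte character whose bytes cross a 128-byte offset in the encoded value.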