[CSHARP-694] Provide some way for the C# driver to be more lenient about UTF8 validity
Created: 01/Mar/13  Updated: 03/Jul/14  Resolved: 19/Mar/13

Status: Closed
Project: C# Driver
Component/s: None
Affects Version/s: 1.7.1
Fix Version/s: 1.8

Type: New Feature Priority: Major - P3
Reporter: Robert Stam Assignee: Robert Stam
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to PYTHON-721 BSON Errors with invalid utf8 strings. Closed
is related to JAVA-1305 Ensure that an exception is thrown if... Closed

 Description   

The C# driver sets .NET's throwOnInvalidBytes argument to true when constructing a UTF8Encoding object. This results in exceptions being thrown when attempting to decode invalid UTF8 returned from the server.

This is generally a good thing, so this should remain the default, but there are cases where other drivers (or mongoimport) create and store UTF8 that is not 100% UTF8 compliant and it would be useful to have a setting that tells the C# driver to be more tolerant of invalid UTF8 returned from the server.

Note that round tripping a document that contains invalid UTF8 would then result in the document changing slightly, since the invalid UTF8 characters would be replaced by the Unicode character sequence for an invalid code point (\uFFFD\uFFFD). The new document would only differ where invalid code points were present, and the new document would now be 100% UTF8 compliant.
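
The behaviour described above can be reproduced with plain .NET, without the driver. The following is only a minimal sketch: the class name Utf8LeniencyDemo, its member names, and the sample bytes are hypothetical and chosen for illustration; the only real API used is System.Text.UTF8Encoding and its throwOnInvalidBytes constructor argument. It is not the driver's implementation.

```csharp
using System;
using System.Text;

// Hypothetical demo class; not part of the C# driver.
public static class Utf8LeniencyDemo
{
    // 'a', a stray 0xFF byte (which can never appear in valid UTF8), 'b'.
    public static readonly byte[] InvalidBytes = { 0x61, 0xFF, 0x62 };

    // Strict: throwOnInvalidBytes = true (the behaviour described in this ticket).
    public static readonly UTF8Encoding Strict = new UTF8Encoding(false, true);

    // Lenient: throwOnInvalidBytes = false, so invalid bytes are replaced.
    public static readonly UTF8Encoding Lenient = new UTF8Encoding(false, false);

    // Returns true if the strict encoding rejects the invalid bytes.
    public static bool StrictDecodeThrows()
    {
        try
        {
            Strict.GetString(InvalidBytes);
            return false;
        }
        catch (DecoderFallbackException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Strict decode throws: " + StrictDecodeThrows());

        // Lenient decoding substitutes U+FFFD for the invalid byte.
        string decoded = Lenient.GetString(InvalidBytes);

        // Re-encoding yields valid UTF8 that differs from the original bytes.
        byte[] roundTripped = Lenient.GetBytes(decoded);
        Console.WriteLine("Original:      " + BitConverter.ToString(InvalidBytes));
        Console.WriteLine("Round-tripped: " + BitConverter.ToString(roundTripped));
    }
}
```

The round-tripped bytes differ from the original only where the invalid byte was, which is the "document changing slightly" effect noted above.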



 Comments   
Comment by auto [ 19/Mar/13 ]

Author: rstam <robert@10gen.com> (2013-03-18T21:34:39Z)

Message: CSHARP-694: Added ReadEncoding and WriteEncoding to MongoDefaults and tweaked ToString output.
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/c68130609c998b657e246687767ae03fa02a3940

Comment by auto [ 19/Mar/13 ]

Author: rstam <robert@10gen.com> (2013-03-18T20:37:51Z)

Message: CSHARP-694: Added ReadEncoding and WriteEncoding to MongoClientSettings also.
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/788edb740a38b389738cd6edc20337411861c4ec

Comment by Robert Stam [ 18/Mar/13 ]

ReadEncoding and WriteEncoding need to be added to MongoClientSettings also.

Comment by auto [ 05/Mar/13 ]

Author: rstam <robert@10gen.com> (2013-03-05T02:59:13Z)

Message: CSHARP-694: Added unit tests for using different Encodings with JsonReader(Writer). Realized that the Encoding is actually set at the StreamReader(Writer) level so JsonReader(Writer)Settings don't need an Encoding property.
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/1f520429c44c2b852f0f50ef772de369e9d71208

Comment by auto [ 04/Mar/13 ]

Author: rstam <robert@10gen.com> (2013-03-04T21:51:08Z)

Message: CSHARP-694: Change values in unit tests to use values that fail in the same way in Mono as in .NET.
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/a672a75a1ef0f6d3395312e9515bfaf635b91340

Comment by auto [ 04/Mar/13 ]

Author: rstam <robert@10gen.com> (2013-03-04T17:20:52Z)

Message: CSHARP-694: Provide a way to be more lenient about UTF8 validity.
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/849a6405c52ccff81ee051bbedafba43ef4759d7
