Type: Spec Change
Resolution: Unresolved
Priority: Minor - P4
Component/s: BSON
Needed
Summary
Currently, some drivers validate that BSON strings are valid UTF-8 when decoding them and throw an exception on invalid UTF-8 (which is what the spec tests currently mandate), while others don’t. The behavior should be made consistent in one of two ways: either every driver provides an option to disable this validation (falling back to U+FFFD � replacement characters for invalid sequences), or the behavior is made uniform across drivers (which would be a breaking change).
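To make the two behaviors concrete, here is a minimal sketch in Python. It is not taken from any driver: `decode_bson_string` and the `validate_utf8` option are hypothetical names used only for illustration. The input imitates a message truncated in the middle of a code point, as described in SERVER-24007.

```python
def decode_bson_string(raw: bytes, validate_utf8: bool = True) -> str:
    """Decode the payload bytes of a BSON string.

    Hypothetical helper: `validate_utf8` is an assumed option name,
    not the API of any particular driver.
    """
    if validate_utf8:
        # Strict behavior: raises UnicodeDecodeError on invalid UTF-8.
        return raw.decode("utf-8", errors="strict")
    # Lossy behavior: each invalid sequence becomes U+FFFD.
    return raw.decode("utf-8", errors="replace")


# "é" is two bytes in UTF-8 (0xC3 0xA9); dropping the last byte leaves
# an incomplete sequence, as if the string had been cut mid code point.
truncated = "café".encode("utf-8")[:-1]  # b"caf\xc3"

try:
    decode_bson_string(truncated)
except UnicodeDecodeError as exc:
    print("strict decoding failed:", exc.reason)

lossy = decode_bson_string(truncated, validate_utf8=False)
print(repr(lossy))
```

With validation enabled the string cannot be read at all; with it disabled the caller gets the readable prefix followed by a replacement character.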
Motivation
Who is the affected end user?
Applications using the driver, and in particular tools like mongosh and Compass.
How does this affect the end user?
Currently, there are documents and other server responses that some drivers can retrieve and others cannot.
How likely is it that this problem or use case will occur?
It’s an edge case for sure. The BSON spec is clear that BSON strings are encoded as UTF-8, so this shouldn’t be possible in the first place. See also DRIVERS-1634.
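As a rough illustration of why such strings can still show up on the wire: a BSON string value is framed as a little-endian int32 byte count (including the trailing NUL), the bytes, and a NUL terminator. That framing is purely byte-based, so a decoder has to check UTF-8 validity as a separate step. The helper names below are hypothetical, and the sketch covers only the string value, not a full document.

```python
import struct


def frame_bson_string(raw: bytes) -> bytes:
    """Frame raw bytes as a BSON string value (hypothetical helper)."""
    # The int32 length counts the payload bytes plus the trailing NUL.
    return struct.pack("<i", len(raw) + 1) + raw + b"\x00"


def read_bson_string_bytes(buf: bytes) -> bytes:
    """Return the payload bytes of a framed BSON string value."""
    (length,) = struct.unpack_from("<i", buf, 0)
    end = 4 + length
    # Only the framing is checked here; UTF-8 validity is not.
    if length < 1 or end > len(buf) or buf[end - 1] != 0:
        raise ValueError("malformed BSON string framing")
    return buf[4:end - 1]


# An invalid UTF-8 payload still produces well-formed framing.
framed = frame_bson_string(b"caf\xc3")
payload = read_bson_string_bytes(framed)
print(framed, payload)
```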
If the problem does occur, what are the consequences and how severe are they?
Some documents cannot be read at all using some drivers.
Is this issue urgent?
I don’t think so.
Is this ticket required by a downstream team?
Yes, for the mongo shell (mongosh).
Is this ticket only for tests?
This is about functional changes.
is caused by
- SERVER-24007 Server can return invalid UTF8 for error messages due to truncation in the middle of a code point (Backlog)
is related to
- JAVA-5575 Java Driver allows inserting invalid UTF-8 as string values (Closed)
- NODE-3627 Getting "Invalid UTF-8 string in BSON document" instead on unique constraint error on bulkWrite.replaceOne (Closed)
- SERVER-93732 The server should reject inserting/updating strings containing invalid UTF-8 (Needs Scheduling)
- DRIVERS-2008 Default to lossy/replacement behavior when decoding UTF-8 in writeErrors (Backlog)
related to
- NODE-3784 Expose disabling utf-8 validation option for users (Closed)