- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: BSON
BSON UTF-8 validation is currently implemented in JavaScript and does not catch some edge cases, such as overlong encodings (see: https://en.wikipedia.org/wiki/UTF-8#Overlong_encodings). We should adopt TextDecoder's fatal mode in both the browser and Node.js when validation is enabled.
AC
- Construct a TextDecoder with the fatal flag set to true when UTF-8 validation is enabled.
- Catch the error thrown by the fatal decoder and rethrow it as a BSONError.
- Consider creating a BSONError subclass "BSONUTF8Error".
- Set the "cause" property on the BSONError to the error thrown by the fatal decoding.
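The criteria above could be sketched roughly as follows. This is only an illustration, not the driver's implementation: the `BSONError` and `BSONUTF8Error` classes here are local stand-ins, and the `decodeUTF8` helper, its `validate` parameter, and the error message are hypothetical names chosen for the example. It assumes a runtime with a global `TextDecoder` and support for the `cause` option on `Error`.

```javascript
// Local stand-ins for the library's error classes (hypothetical, for illustration only).
class BSONError extends Error {
  constructor(message, options) {
    super(message, options); // forwards { cause } to Error
    this.name = 'BSONError';
  }
}

class BSONUTF8Error extends BSONError {
  constructor(message, options) {
    super(message, options);
    this.name = 'BSONUTF8Error';
  }
}

// Decoders are reusable, so create one per mode up front.
const fatalDecoder = new TextDecoder('utf-8', { fatal: true });
const lenientDecoder = new TextDecoder('utf-8', { fatal: false });

// Hypothetical helper: decode string bytes, throwing only when validation is on.
function decodeUTF8(bytes, validate) {
  if (!validate) {
    // Invalid sequences become U+FFFD replacement characters.
    return lenientDecoder.decode(bytes);
  }
  try {
    return fatalDecoder.decode(bytes);
  } catch (cause) {
    // Rethrow as a BSON error and keep the original error as the cause.
    throw new BSONUTF8Error('Invalid UTF-8 string in BSON document', { cause });
  }
}
```

Callers that already catch `BSONError` would keep working with a subclass, while `error.cause` still exposes the decoder's original error for debugging.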
Test AC
- Use the overlong encoding example from Wikipedia as a test case.
- The following BSON contains the byte sequence "f08282ac", an overlong four-byte encoding of the € (euro) character, U+20AC, whose shortest valid encoding is the three bytes "e282ac".
- This sequence is invalid UTF-8, but the current BSON implementation generates replacement characters instead of throwing.
BSON.deserialize(Buffer.from('11000000025f0005000000f08282ac0000', 'hex')) // { _: '����' }
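To make the test input easier to reason about, the document bytes can be taken apart by hand. The sketch below does not use the BSON library at all; it walks the layout of this one document (variable names are illustrative) and checks that a fatal `TextDecoder` rejects the string payload. It assumes Node.js, where `Buffer` and `TextDecoder` are globals.

```javascript
const doc = Buffer.from('11000000025f0005000000f08282ac0000', 'hex');

// Layout: int32 document size, element type, key cstring,
// int32 string size (including trailing NUL), string bytes, NUL, document terminator.
const docSize = doc.readInt32LE(0);            // total size in bytes
const elementType = doc[4];                    // 0x02 marks a UTF-8 string element
const keyEnd = doc.indexOf(0x00, 5);           // NUL that ends the key
const key = doc.toString('latin1', 5, keyEnd); // the key is the single character "_"
const strSize = doc.readInt32LE(keyEnd + 1);   // string length including its NUL
const strStart = keyEnd + 1 + 4;               // first byte after the length prefix
const strBytes = doc.subarray(strStart, strStart + strSize - 1);

// A fatal decoder should refuse the overlong sequence instead of substituting U+FFFD.
let rejected = false;
try {
  new TextDecoder('utf-8', { fatal: true }).decode(strBytes);
} catch {
  rejected = true;
}
```

With validation enabled, a test for the fixed behavior would expect `BSON.deserialize` to throw on this input rather than return replacement characters.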