[CDRIVER-2168] Escape JSON strings for regex, symbol, and DBPointer types
Created: 24/May/17  Updated: 28/Oct/23  Resolved: 16/Jun/17

Status: Closed
Project: C Driver
Component/s: libbson
Affects Version/s: 1.6.3
Fix Version/s: 1.7.0

Type: Bug
Priority: Major - P3
Reporter: Jeremy Mikola
Assignee: A. Jesse Jiryu Davis
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to CDRIVER-2128 Support revised Extended JSON spec Closed
is related to CDRIVER-2208 Introduce new API for Extended JSON s... Closed

 Description   

When converting BSON to JSON and extended JSON, bson_utf8_escape_for_json() should be used to escape strings for the regex, symbol, and DBPointer types. Escaping is currently done only for keys, UTF-8 strings, and the code types (with and without scope).

Note: escaping regex patterns with bson_utf8_escape_for_json() causes the BSON corpus test case "regex with slash" to fail. JSON does not require forward slashes to be escaped, so this may mean that bson_utf8_escape_for_json() is too aggressive.
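To make the note above concrete, here is a minimal, self-contained sketch of a JSON string escaper that escapes only what JSON requires (quotation mark, reverse solidus, and control characters U+0000 through U+001F) and leaves "/" untouched. The function name escape_json_string() is hypothetical, and this is not libbson's implementation of bson_utf8_escape_for_json(); it assumes NUL-terminated input and copies bytes at or above 0x80 through without validating UTF-8.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libbson. Returns a newly allocated,
 * NUL-terminated copy of `in` with JSON string escapes applied, or NULL
 * if allocation fails. The caller frees the result with free(). */
char *
escape_json_string (const char *in)
{
   size_t len = strlen (in);
   /* Worst case: every input byte becomes a 6-byte \uXXXX escape. */
   char *out = malloc (len * 6 + 1);
   char *p = out;
   size_t i;

   if (!out) {
      return NULL;
   }

   for (i = 0; i < len; i++) {
      unsigned char c = (unsigned char) in[i];

      switch (c) {
      case '"':  *p++ = '\\'; *p++ = '"';  break;
      case '\\': *p++ = '\\'; *p++ = '\\'; break;
      case '\b': *p++ = '\\'; *p++ = 'b';  break;
      case '\f': *p++ = '\\'; *p++ = 'f';  break;
      case '\n': *p++ = '\\'; *p++ = 'n';  break;
      case '\r': *p++ = '\\'; *p++ = 'r';  break;
      case '\t': *p++ = '\\'; *p++ = 't';  break;
      default:
         if (c < 0x20) {
            /* Remaining control characters: four hex digits. Printing
             * the value in decimal would turn U+001F into \u0031,
             * which denotes a different character. */
            p += sprintf (p, "\\u%04x", (unsigned int) c);
         } else {
            /* Everything else, including '/', is copied unchanged. */
            *p++ = (char) c;
         }
         break;
      }
   }

   *p = '\0';
   return out;
}
```

With this sketch, a regex pattern such as a/b comes back unchanged, while a double quote or a backslash in the pattern gains a leading backslash.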



 Comments   
Comment by Ramon Fernandez Marina [ 12/Sep/17 ]

Author: A. Jesse Jiryu Davis (ajdavis) <jesse@mongodb.com>

Message: CDRIVER-2168 fix JSON UTF-8 escapes

Don't escape "/". Use hex code, not decimal, for u-escapes.
Branch: master
https://github.com/mongodb/libbson/commit/2c030769a3a66168daaa481ed4b4fdc16310e3e9

Comment by A. Jesse Jiryu Davis [ 19/Jul/17 ]

Sounds good, seems like it was my oversight.

Comment by Jeremy Mikola [ 19/Jul/17 ]

jesse: I noticed that your commit above never added escaping to the legacy DBPointer function. Since I'm consolidating the canonical/relaxed/legacy encoding in CDRIVER-2208, the collection name for all DBPointer types will now be escaped in JSON.

I'm not sure if that was an oversight or not, but I thought it worth mentioning the change.

Comment by A. Jesse Jiryu Davis [ 18/Jun/17 ]

Closed in https://github.com/mongodb/libbson/commit/2c030769a3a66168daaa481ed4b4fdc16310e3e9

Comment by David Golden [ 24/May/17 ]

Formally, the RFC says this:

A string begins and ends with
quotation marks. All Unicode characters may be placed within the
quotation marks, except for the characters that must be escaped:
quotation mark, reverse solidus, and the control characters (U+0000
through U+001F).

Any character may be escaped.

My view is that testing extended JSON must account for differences between a driver's JSON output (whitespace, key order, escaping, etc.) and the valid JSON of the corpus. One way to do that would be to "wash" the corpus and generated text through a pure JSON decode/encode cycle that also sorts keys. That's less meaningful for libbson, which doesn't have a meaningful intermediate representation.
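A full decode/encode cycle with key sorting needs a JSON library, so the following is only a much narrower stand-in for the "wash" idea: a sketch that removes whitespace outside string literals and rewrites the optional escape "\/" as "/" inside them, so that two texts differing only in those respects compare equal. The name wash_json() is hypothetical, the input is assumed to be valid JSON, and key order, \uXXXX escapes, and number formatting are left untouched.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libbson. Returns a newly allocated
 * copy of `json` with insignificant whitespace removed and "\/" inside
 * strings rewritten as "/". Returns NULL if allocation fails. The
 * caller frees the result with free(). */
char *
wash_json (const char *json)
{
   size_t len = strlen (json);
   char *out = malloc (len + 1); /* the output never grows */
   char *p = out;
   int in_string = 0;
   size_t i;

   if (!out) {
      return NULL;
   }

   for (i = 0; i < len; i++) {
      char c = json[i];

      if (in_string) {
         if (c == '\\' && i + 1 < len) {
            /* Consume the escape as a pair so that "\\" followed by
             * "/" is not mistaken for the escape "\/". */
            char next = json[++i];
            if (next != '/') {
               *p++ = '\\'; /* keep every escape except "\/" */
            }
            *p++ = next;
         } else {
            if (c == '"') {
               in_string = 0; /* closing quote */
            }
            *p++ = c;
         }
      } else if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
         continue; /* whitespace between tokens is insignificant */
      } else {
         if (c == '"') {
            in_string = 1; /* opening quote */
         }
         *p++ = c;
      }
   }

   *p = '\0';
   return out;
}
```

A test harness could pass both the corpus text and the driver's output through wash_json() and compare the results with strcmp(); differences in key order would still need a real parser.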

Comment by Jeremy Mikola [ 24/May/17 ]

https://github.com/mongodb/libbson/pull/191

Generated at Wed Feb 07 21:14:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.