[CDRIVER-4819] Allow reducing precision when converting BSON double values to JSON Created: 29/Jan/24  Updated: 29/Jan/24

Status: Backlog
Project: C Driver
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Unknown
Reporter: Andreas Braun
Assignee: Unassigned
Resolution: Unresolved
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by PHPC-2341 Approximative encoding of floats in c... Blocked
Problem/Incident
is caused by CDRIVER-1945 bson_as_json outputs doubles as integers Closed
Related
is related to CDRIVER-2063 JSON export prints insignificant digi... Closed
is related to CDRIVER-3377 Double value retrieved from bson is d... Closed
is related to CDRIVER-3938 Rounding errors in double type in bso... Closed
is related to CDRIVER-3500 Reduce floating point precision requi... Backlog
Assigned Teams:
C Drivers

 Description   

In _bson_as_json_visit_double, BSON double values are converted to strings with a precision of 20 decimal digits. This can cause problems for values that round to a short representation at the default double precision (~16 digits) but not at the extended precision. One such value is 1.99, which is output as 1.9899999999999999911 when 20 digits of precision are used.

This is problematic because users who store such a value are presented with a different value in the resulting JSON. I'll note that reading the resulting JSON back into BSON yields the original value, as a C double cannot hold 20 digits of precision.

A possible solution is to lower the precision: the 53 bits of precision in a double correspond to about 16 digits of decimal precision. If this is considered a BC break, adding a precision option to _bson_json_opts_t for use in bson_as_json_with_opts could provide a workaround for affected users.



 Comments   
Comment by Kevin Albertson [ 29/Jan/24 ]

I am hesitant to change the default because it could break backward compatibility, but the change may be well motivated.

The C driver skips BSON corpus tests related to double precision.

Open questions:

Since a double can only represent about 16 decimal digits, would reducing the precision to 16 digits leave round-tripping unaffected? Example:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char str20[32];
    char str16[32];

    snprintf(str20, sizeof(str20), "%.20g", 1.99); // 1.9899999999999999911
    snprintf(str16, sizeof(str16), "%.16g", 1.99); // 1.99

    double d20 = strtod(str20, NULL); // libbson parses with `strtod`
    double d16 = strtod(str16, NULL);
    assert(d20 == d16);

    // Check that both doubles have the same memory representation.
    assert(0 == memcmp(&d20, &d16, sizeof(double)));
    return 0;
}

If there is no impact on round-tripping, that may suggest less risk. However, the output of libbson could be input to different JSON parsers.
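
One way to probe the round-tripping question for more values is to format a double at a given precision, parse it back, and compare. The sketch below is illustrative only: round_trips is a hypothetical helper, and it assumes parsing with strtod as in the snippet above. Note that 16 digits are not sufficient for every double: 0.30000000000000004 (the usual result of 0.1 + 0.2) is printed as 0.3 at 16 digits, which parses to a different double, whereas 17 digits preserve it.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns true if `value`, formatted with `precision`
 * significant digits and parsed back with strtod, equals the original. */
static bool
round_trips (double value, int precision)
{
   char buf[64];
   int len = snprintf (buf, sizeof buf, "%.*g", precision, value);
   if (len < 0 || (size_t) len >= sizeof buf) {
      return false; /* formatting failed or output was truncated */
   }
   return strtod (buf, NULL) == value;
}
```

Under these assumptions, round_trips (1.99, 16) holds, while round_trips (0.30000000000000004, 16) does not; that value needs a precision of 17.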

What do other JSON libraries use? https://github.com/json-c/json-c/blob/dabed80523fa5101e30f0ee57ba06b02beae73eb/json_object.c#L1016 shows 17:

const char *std_format = "%.17g";

https://bugs.python.org/issue1580 also references 17.
Does libbson's 20-digit precision produce double representations that cannot be parsed by other drivers? If “yes”, that may help motivate changing the default.
