[CDRIVER-3377] Double value retrieved from bson is different than expected Created: 27/Sep/19  Updated: 29/Jan/24  Resolved: 22/Oct/19

Status: Closed
Project: C Driver
Component/s: libbson
Affects Version/s: 1.13.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Aziz Zitouni Assignee: Jeremy Mikola
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to CDRIVER-4819 Allow reducing precision when convert... Backlog
is related to CDRIVER-1945 bson_as_json outputs doubles as integers Closed
is related to CDRIVER-652 Number formatting and whitespace in b... Closed

 Description   

Example code:

#include <bson.h>
#include <stdio.h>
 
int main() {
 bson_t b = BSON_INITIALIZER;
 BSON_APPEND_DOUBLE (&b, "a", 3399.99);
 
 size_t len;
 char *str;
 str = bson_as_json (&b, &len);
 printf ("%s\n", str);
 
 bson_free (str);
 return 0;
}

Actual output:

{ "a" : 3399.9899999999997817}

Expected output:

{ "a" : 3399.99}

 

 



 Comments   
Comment by Jeremy Mikola [ 22/Oct/19 ]

Closing this as I don't believe there are any libbson changes to make based on my previous response. Using a Decimal128 type is likely the best way to avoid any issues with inexact representations using normal float types.

Comment by Jeremy Mikola [ 30/Sep/19 ]

I believe this is closely tied to the inability to exactly represent decimal fractions in binary floating point (as discussed in this Stack Overflow thread). I'll note up front that MongoDB provides a Decimal128 BSON type specifically to overcome this issue, and I'll revisit that later in this post.
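To illustrate the point, here is a minimal sketch in standard C (no libbson involved; the helper name same_double is hypothetical). It assumes IEEE 754 doubles and a correctly rounding strtod, and checks whether two decimal strings land on the same stored double:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true when both decimal strings parse to the
 * identical double, i.e. the binary format cannot tell them apart. */
static bool
same_double (const char *lhs, const char *rhs)
{
    return strtod (lhs, NULL) == strtod (rhs, NULL);
}
```

With that helper, same_double ("3399.99", "3399.9899999999997817") should hold: the short literal and the long output from the report name the same double, so the long form is what was stored all along rather than an error introduced on output.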

The original format string was "%lf" and dates back to libbson's original commit (f3911ed). This was later changed to "%.15g" in 9eb14d1 for CDRIVER-652, and most recently "%.20g" in 9db0c61 for CDRIVER-1945.
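For comparison, the three format strings can be tried side by side with plain snprintf. This is only a sketch: the enum and helper names are hypothetical, and it does not reproduce libbson's actual JSON encoder, just the format strings named above.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical labels for the three format strings mentioned above. */
typedef enum { FMT_LF, FMT_15G, FMT_20G } double_fmt_t;

/* Format `value` with the chosen format string; returns snprintf's result. */
static int
format_double (char *buf, size_t size, double_fmt_t fmt, double value)
{
    switch (fmt) {
    case FMT_LF:
        return snprintf (buf, size, "%lf", value);
    case FMT_15G:
        return snprintf (buf, size, "%.15g", value);
    case FMT_20G:
        return snprintf (buf, size, "%.20g", value);
    }
    return -1;
}

/* True when the chosen format echoes the literal exactly as typed. */
static bool
prints_as_typed (double_fmt_t fmt, double value, const char *typed)
{
    char buf[64];
    int n = format_double (buf, sizeof buf, fmt, value);

    return n >= 0 && (size_t) n < sizeof buf && strcmp (buf, typed) == 0;
}
```

For 3399.99, "%lf" pads to "3399.990000", "%.15g" yields "3399.99", and "%.20g" yields "3399.9899999999997817", so only the "%.15g" variant echoes the literal as typed.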

Reading through CDRIVER-652, I think this comment highlights why "%g" was preferable to "%f", as it allows for shorter scientific notation to be used when possible. That reduced non-significant output noise for decimal values that do not have an exact binary floating point representation, but obviously doesn't work in all cases (as you're also reporting here). Offhand, I'm not sure why "%.15g" was employed over "%g" alone. Perhaps Jeroenooms or jesse can chime in on that.

I tested this with a few different patterns using a modified version of the program you shared:

/* gcc example-cdriver-3377.c -o example-cdriver-3377 \
 *     $(pkg-config --cflags --libs libmongoc-1.0) */
 
/* ./example-cdriver-3377 */
 
#include <stdio.h>
#include <mongoc/mongoc.h>
 
int
main (int argc, char *argv[])
{
    bson_t b = BSON_INITIALIZER;
    BSON_APPEND_DOUBLE (&b, "a", 3399.99);
 
    size_t len;
    char *str;
 
    str = bson_as_json (&b, &len);
    printf ("legacy: %s\n", str);
    bson_free (str);
 
    str = bson_as_relaxed_extended_json (&b, &len);
    printf ("extended: %s\n", str);
    bson_free (str);
 
    str = bson_as_canonical_extended_json (&b, &len);
    printf ("canonical: %s\n", str);
    bson_free (str);
 
    bson_iter_t iter;
 
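    /* Look up "a"; bail out if the key is unexpectedly missing. */
    if (!bson_iter_init_find (&iter, &b, "a")) {
        bson_destroy (&b);
        return 1;
    }
    printf ("%%f: %f\n", bson_iter_double (&iter));
    printf ("%%g: %g\n", bson_iter_double (&iter));
    printf ("%%.20g: %.20g\n", bson_iter_double (&iter));
 
    bson_destroy (&b);
    return 0;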
}

I observed the following output running this program:

$ ./example-cdriver-3377 
legacy: { "a" : 3399.9899999999997817 }
extended: { "a" : 3399.9899999999997817 }
canonical: { "a" : { "$numberDouble" : "3399.9899999999997817" } }
%f: 3399.990000
%g: 3399.99
%.20g: 3399.9899999999997817

Based on that output, one would assume that a simple "%g" format might achieve what you're looking for; however, further testing revealed possible adverse side effects of not specifying a precision modifier with "%g". Without an explicit precision, "%g" defaults to six significant digits, so using "%g" alone with 1.0000001 prints "1" and drops the fraction.
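One common workaround on the application side, sketched here in standard C, is to try increasing precisions and keep the first output that parses back to the same double. This is not what libbson does, the helper names are hypothetical, and it assumes IEEE 754 doubles, for which 17 significant digits always round-trip:

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: true when plain "%g" (six significant digits)
 * does not parse back to the original value. */
static bool
plain_g_loses_digits (double value)
{
    char buf[64];

    snprintf (buf, sizeof buf, "%g", value);
    return strtod (buf, NULL) != value;
}

/* Hypothetical helper: write the shortest of "%.15g", "%.16g", "%.17g"
 * that strtod() reads back as the same double. Returns the string
 * length, or -1 if the buffer is too small. */
static int
format_shortest_roundtrip (char *buf, size_t size, double value)
{
    int written = -1;

    for (int precision = 15; precision <= 17; precision++) {
        written = snprintf (buf, size, "%.*g", precision, value);
        if (written < 0 || (size_t) written >= size) {
            return -1;
        }
        if (strtod (buf, NULL) == value) {
            break; /* this precision already identifies the double */
        }
    }
    return written;
}
```

Under those assumptions, 3399.99 comes out as "3399.99" and 1.0000001 keeps its fraction, while a value such as 0.30000000000000004 still gets the full 17 digits it needs.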

If you are attempting to model monetary data, I might suggest using the Decimal128 BSON type instead of a floating point value. That BSON type stores the value in decimal rather than binary, so it will represent your decimal value exactly (up to 34 significant digits). It may also require an extra step when parsing JSON output, since the value will be reported as a string value in an object with a "$numberDecimal" key (similar to the canonical output for floating points above, which uses "$numberDouble").

Generated at Wed Feb 07 21:17:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.