[CDRIVER-2063] JSON export prints insignificant digits / noise Created: 18/Feb/17  Updated: 29/Jan/24  Resolved: 28/Feb/17

Status: Closed
Project: C Driver
Component/s: json
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Jeroen Ooms
Assignee: A. Jesse Jiryu Davis
Resolution: Works as Designed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to CDRIVER-3626 Creating bson from json: Losing decim... Closed
related to CDRIVER-3938 Rounding errors in double type in bso... Closed
related to CDRIVER-4819 Allow reducing precision when convert... Backlog
is related to CDRIVER-652 Number formatting and whitespace in b... Closed

 Description   

Since version 1.6 of the C driver, exporting data to JSON prints doubles with spurious trailing digits. For example, take a simple value:

> db.test.insert({"x":1.8})
WriteResult({ "nInserted" : 1 })
> db.test.find()
{ "_id" : ObjectId("58a84a0325b646783be8cf6e"), "x" : 1.8 }

Previously, exporting this data with the C driver would lead to the correct output:

> mongo("test")$export()
{ "_id" : { "$oid" : "58a84a0325b646783be8cf6e" }, "x" : 1.8 }

However, with the new C driver I get:

> mongo("test")$export()
{ "_id" : { "$oid" : "58a84a0325b646783be8cf6e" }, "x" : 1.8000000000000000444 }

Unfortunately this can be a serious problem for scientific applications where numeric precision is important.



 Comments   
Comment by A. Jesse Jiryu Davis [ 28/Feb/17 ]

Closing based on Luke's comment.

Comment by Luke Lovett [ 21/Feb/17 ]

Jesse, I think that %.20g is reasonable. Even going back to %g would not solve the problem Jeroen sees in general, since numbers that need more than six significant digits would be truncated. It seems a better policy to emit more digits than necessary.

FWIW, the Python interpreter does something more complicated when repr(some_float) is called on a floating-point value: it prints the fewest possible digits d_0, d_1, ..., d_n such that float("d_0 d_1 ... d_n") == some_float. This is very convenient, but I don't think it's necessary for everyone to implement. See https://bugs.python.org/issue1580 if interested.
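
To make the "fewest digits that round-trip" idea concrete, here is a minimal C sketch. It is a brute-force stand-in, not the algorithm Python uses and not anything in libbson: it tries increasing printf precisions until strtod() parses the text back to the same double. The function name format_shortest_roundtrip is hypothetical, and the sketch assumes IEEE 754 doubles (for which 17 significant digits always suffice) and the default "C" locale.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only.
 * Writes into buf the shortest "%.{p}g" rendering of value (p = 1..17)
 * that strtod() parses back to exactly the same double.
 * Returns the precision used, or 0 on failure. */
int format_shortest_roundtrip(char *buf, size_t len, double value)
{
    for (int precision = 1; precision <= 17; precision++) {
        int n = snprintf(buf, len, "%.*g", precision, value);
        if (n < 0 || (size_t) n >= len) {
            return 0; /* encoding error or buffer too small */
        }
        if (strtod(buf, NULL) == value) {
            return precision; /* text round-trips; stop at the first hit */
        }
    }
    return 0; /* only reached for NaN, which never compares equal */
}
```

Called on 1.8 with a 32-byte buffer, this sketch writes 1.8 and returns 2. The cost is up to 17 format/parse cycles per double, which is one reason a fixed format string is the simpler choice.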

Comment by A. Jesse Jiryu Davis [ 20/Feb/17 ]

Hey Jeroen, I asked @llvt to help answer, since he's the author of our Extended JSON Spec and understands format specifiers better than I do. What I want to say is, we've somewhat standardized on printf("%.20g") as our definitive string encoding of doubles, as you can see in this test:

https://github.com/mongodb/specifications/blob/deff67d64b61861fee9e02dc5f38f574dc8b0513/source/bson-corpus/tests/double.json#L17-L17

I changed the format string in libbson 1.6 from "%g" to "%.20g", which resulted in the change you see, but it matches the standard tests that all MongoDB drivers must pass so I don't think we should revert it.
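
The effect of the two format strings can be reproduced outside the driver with plain snprintf. The sketch below is not libbson code; the helper names format_g, format_20g, and same_double are hypothetical, and the digits shown assume IEEE 754 doubles and a C library whose printf and strtod round correctly (as glibc's do).

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helpers, for illustration only. */

/* "%g": at most 6 significant digits, so longer values are truncated. */
int format_g(char *buf, size_t len, double value)
{
    return snprintf(buf, len, "%g", value);
}

/* "%.20g": up to 20 significant digits, more than a double needs. */
int format_20g(char *buf, size_t len, double value)
{
    return snprintf(buf, len, "%.20g", value);
}

/* Returns 1 if both decimal strings parse to the same double. */
int same_double(const char *a, const char *b)
{
    return strtod(a, NULL) == strtod(b, NULL);
}
```

With these helpers, format_g renders 1.8 as 1.8 but renders 1.23456789 as 1.23457, while format_20g renders 1.8 as 1.8000000000000000444. Both renderings of 1.8 parse back to the same double, so the extra digits are noise in the text rather than a change in the stored value.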

Comment by Jeroen Ooms [ 18/Feb/17 ]

It looks like `mongoexport` does not suffer from this problem; only the C driver does.

 mongoexport --db test --collection test
2017-02-18T14:33:18.500+0100	connected to: localhost
{"_id":{"$oid":"58a84a0325b646783be8cf6e"},"x":1.8}
2017-02-18T14:33:18.500+0100	exported 1 record
