Details
Task
Resolution: Works as Designed
Minor - P4
Description
Hi everyone,
There are some API calls in mongoc that require BSON arrays as their input; aggregation pipelines are a good example. But that is hardly the main reason for the following question.
When marshalling high-level language types to and from BSON, it is hard to reliably determine whether a root document is an array. There is no such issue for nested arrays because one can rely on the element's type: BSON_DOCUMENT vs BSON_ARRAY.
More specifically, if I want to build a root BSON array, I need to append elements with string keys "0", "1", "2", etc., converting the original indexes with bson_uint32_to_string(). The output BSON will look like this:
{ "0" : "aaa", "1" : "bbb", "2" : "ccc", ... }
If I fail to do this, the underlying bson_append_array() routine in mongoc will complain on stderr about improperly formed array keys.
But what happens if I want to restore the original array in a high-level language from such a document? Obviously, I need to somehow determine that this document is indeed an array because it would otherwise be incorrect to marshal it back as a document (associative array) with string keys "0", "1", "2", etc.
Should I first parse all the keys converting them from strings to integers and checking if they are in ascending order and then traverse the document again?
If yes, there is no bson_string_to_uint32() call for a quick reverse conversion, and I am left with slow glibc calls like strtol() to do that.
If no, I can just check the first key and see if it is "0", but that doesn't seem to be reliable to me because one can forge a document like
{ "0" : "aaa", "2" : "bbb", "4" : "ccc" }
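For illustration, the full key check can be done in a single pass without converting strings to integers: render the expected index as a decimal string and compare it with each key. The sketch below is plain C and does not use libbson; keys_form_array() is a hypothetical helper name, and the keys are assumed to have been collected in document order already (with libbson they would come from bson_iter_key()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libbson): returns true when
 * keys[0..n-1] are exactly "0", "1", ..., "n-1" in that order. */
static bool keys_form_array(const char *const *keys, size_t n)
{
    char expected[24]; /* enough for any 64-bit index in decimal */

    assert(keys != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Render the index the key must have at this position. */
        snprintf(expected, sizeof expected, "%zu", i);

        /* An exact string match rejects gaps ("0", "2", "4") as well as
         * non-canonical spellings such as "01" or "+1". */
        if (keys[i] == NULL || strcmp(keys[i], expected) != 0) {
            return false;
        }
    }
    return true;
}
```

With this check, the keys "0", "1", "2" are accepted and the forged "0", "2", "4" sequence is rejected at the second key. An empty document also passes, so it stays ambiguous between an empty array and an empty document. When libbson is available, bson_uint32_to_string() could presumably take the place of snprintf() here.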
So, what is the official canonical way to marshal root BSON arrays back to a high-level language?
Of course, this leaves open the question of why BSON is designed to need string keys "0", "1", "2", ... for arrays at all, instead of having a proper way to encode arrays without them. This could be done fairly easily while retaining backward compatibility.
Thanks!