- Type: New Feature
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: js-bson
It is quite difficult to parse large BSON blobs with this library, since it requires pulling the entire blob into memory.
For example, I'm trying to take a closer look at the data in a particular section of a mongodump file, which is all BSON. Unfortunately this file is tens of gigabytes in size, so parsing it is extremely memory heavy; in some cases these files exceed the memory of the host, making it impossible to parse them at all. If this library supported piped streams, the memory footprint would be reduced drastically.
I am aware that there is a method called 'deserializeStream'; however, it does not seem to have anything to do with Node streams, since it still requires that a complete buffer be passed in rather than accepting a piped stream.
There is also a third-party implementation here: https://github.com/timkuijsten/node-bson-stream. However, that implementation is no longer maintained; its last update was over five years ago. It would be nice to have first-party support.
Expected usage:
const fs = require('fs');
const BSON = require('bson');

const bsonStreamParser = new BSON.streamParser(opts);
const fileStream = fs.createReadStream('./myData.dump');

fileStream
  .pipe(bsonStreamParser)
  .on('data', (deserializedBsonDocument) => {
    // logic
  });
- related to: NODE-2674 On Demand BSON Access - Development Complete