- Type: Improvement
- Resolution: Fixed
- Priority: Unknown
- Affects Version/s: None
- Component/s: None
- None
Description: During a review of the GridFS implementation in the synchronous Java driver, specifically the GridFSDownloadStream and GridFSUploadStream classes, we found that byte arrays are cloned twice internally via the Binary class. This extra copying reduces throughput.
Initial tests indicate that eliminating the unnecessary cloning of byte arrays increases throughput by 33-37%. Throughput is reported in megabytes per second (MBps): the total megabytes transferred per run divided by the median (50th percentile) time per iteration, in seconds.
However, because of the compatibility commitments documented on the getData() method of the Binary class, changing this behavior directly could be a breaking change.
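For reference, the throughput metric described above can be sketched as follows. This is a minimal illustration, not the benchmark code used for the measurements: the class and method names are hypothetical, and averaging the two middle values for an even number of iterations is an assumption, since the ticket does not say how the 50th percentile is computed.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the driver or its benchmark suite.
final class ThroughputSketch {

    // Median (50th percentile) of the per-iteration times, in seconds.
    // Assumption: for an even count, the two middle values are averaged.
    static double medianSeconds(double[] secondsPerIteration) {
        double[] sorted = secondsPerIteration.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Throughput in MBps: megabytes transferred per run divided by the median time.
    static double throughputMBps(double megabytesPerRun, double[] secondsPerIteration) {
        return megabytesPerRun / medianSeconds(secondsPerIteration);
    }

    public static void main(String[] args) {
        // Made-up timings, for illustration only.
        double[] times = {2.0, 1.0, 4.0};
        System.out.println(throughputMBps(100.0, times) + " MBps");
    }
}
```

Using the median rather than the mean keeps a single slow iteration (for example, a GC pause) from skewing the reported figure.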
Proposed Solution: Instead of modifying the Binary class, change GridFSDownloadStream and GridFSUploadStream to use BsonDocument instead of Document. With BsonDocument, binary values are held in the BsonBinary class, which does not copy the byte array, so this sidesteps the compatibility issue while still delivering the performance improvement.
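The difference between the two behaviors can be illustrated with a self-contained sketch. The two holder classes below are hypothetical stand-ins, not the real Binary and BsonBinary classes: CopyingHolder assumes that "cloned twice" means one defensive copy on construction and one on read, and SharingHolder models a wrapper that keeps and returns the caller's array without copying.

```java
import java.util.Arrays;

// Hypothetical stand-ins for illustration; not the org.bson classes.
final class CopySketch {

    // Models a wrapper that defensively copies on the way in and on the way out.
    static final class CopyingHolder {
        private final byte[] data;
        CopyingHolder(byte[] data) { this.data = data.clone(); } // first copy
        byte[] getData() { return data.clone(); }                // second copy
    }

    // Models a wrapper that shares the caller's array and never copies it.
    static final class SharingHolder {
        private final byte[] data;
        SharingHolder(byte[] data) { this.data = data; }
        byte[] getData() { return data; }
    }

    public static void main(String[] args) {
        byte[] chunk = new byte[255 * 1024]; // chunk size chosen for illustration

        byte[] copied = new CopyingHolder(chunk).getData();
        byte[] shared = new SharingHolder(chunk).getData();

        // Same contents either way, but only the sharing holder returns the same array.
        System.out.println("copied is same array: " + (copied == chunk));
        System.out.println("shared is same array: " + (shared == chunk));
        System.out.println("contents equal: " + Arrays.equals(copied, shared));
    }
}
```

The trade-off is the usual one for defensive copies: the sharing variant avoids two allocations and two array copies per chunk, but callers must not mutate the array after handing it over. That is acceptable inside the stream classes, where the driver controls the buffer's lifetime.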