-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Critical - P2
-
Affects Version/s: None
-
Component/s: Performance
-
Go Drivers
Context
Recent benchmarking and profiling have shown a CPU and memory performance regression in the V2 bson package compared to V1, despite optimizations such as the reintroduction of sync.Pool (see GODRIVER-3533). The remaining overhead is attributed primarily to the use of buffered IO (bufio.Reader and io.ReadFull) in the bson.(*valueReader) implementation, introduced for simplicity in PR #1698 following discussion in PR #1673.
Profiling shows that a significant share of CPU time and allocations is spent in bufio and related methods. As a result, V2 performs worse than V1 in both allocations (85 vs. 47 allocs/op) and CPU time (4567 vs. 3344 ns/op). A preliminary proof of concept that removes the bufio dependency restores parity with V1 on both metrics.
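A minimal sketch of how the allocation gap between the two reading styles could be observed in isolation. The functions lengthViaBufio and lengthViaSlice are hypothetical and are not part of the driver; they only read a 4-byte little-endian length prefix, once through bufio.Reader and io.ReadFull and once by indexing the slice. The allocation counts it prints depend on the Go version and on escape analysis, so they should not be compared directly with the figures above.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
	"testing"
)

// lengthViaBufio is a hypothetical helper: it reads the 4-byte
// little-endian length prefix through bufio.Reader and io.ReadFull.
func lengthViaBufio(data []byte) (int32, error) {
	br := bufio.NewReader(bytes.NewReader(data))
	var buf [4]byte
	if _, err := io.ReadFull(br, buf[:]); err != nil {
		return 0, err
	}
	return int32(binary.LittleEndian.Uint32(buf[:])), nil
}

// lengthViaSlice is a hypothetical helper: it reads the same prefix by
// indexing the input slice directly, without an intermediate reader.
func lengthViaSlice(data []byte) (int32, error) {
	if len(data) < 4 {
		return 0, io.ErrUnexpectedEOF
	}
	return int32(binary.LittleEndian.Uint32(data)), nil
}

func main() {
	// BSON encoding of {"a": int32(1)}: length, type 0x10, key, value, terminator.
	doc := []byte{0x0c, 0, 0, 0, 0x10, 'a', 0, 1, 0, 0, 0, 0}

	// testing.AllocsPerRun reports the average number of heap allocations per call.
	bufioAllocs := testing.AllocsPerRun(1000, func() { _, _ = lengthViaBufio(doc) })
	sliceAllocs := testing.AllocsPerRun(1000, func() { _, _ = lengthViaSlice(doc) })

	fmt.Printf("bufio: %.0f allocs/op, slice: %.0f allocs/op\n", bufioAllocs, sliceAllocs)
}
```

For numbers comparable to the ones in this ticket, the driver's own benchmarks run with go test -bench and -benchmem remain the reference; this sketch only isolates the reading pattern.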
See this doc for more details: https://docs.google.com/document/d/10ZbdAqjbIjHK8PuDLD9TxJU1dcYE8vW6klmksKG0tXo/edit?tab=t.0#heading=h.st4i76uaj4yu
Definition of done
Revert the changes from PR #1698 to remove bufio usage in bson’s valueReader, restoring a custom, zero-allocation reading pattern similar to V1. This will eliminate unnecessary allocations and reduce CPU usage per operation. A reference POC and supporting benchmarks have already shown this change eliminates the performance regression.
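As an illustration of what a slice-backed reading pattern can look like, here is a minimal sketch. The sliceReader type and its methods are hypothetical names chosen for this example; they are not the V1 or V2 valueReader and make no claim about the actual POC. The idea shown is that every read returns a sub-slice of the source or decodes in place and advances an offset, so the read path itself copies nothing.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// errShort is returned when the source has fewer bytes than requested.
var errShort = errors.New("not enough bytes")

// sliceReader is a hypothetical reader that walks a byte slice directly
// instead of going through bufio.Reader.
type sliceReader struct {
	src    []byte
	offset int
}

// readBytes returns a view of the next n bytes without copying them.
func (r *sliceReader) readBytes(n int) ([]byte, error) {
	if n < 0 || len(r.src)-r.offset < n {
		return nil, errShort
	}
	b := r.src[r.offset : r.offset+n]
	r.offset += n
	return b, nil
}

// readInt32 decodes a little-endian int32, the encoding BSON uses for
// document lengths and int32 values.
func (r *sliceReader) readInt32() (int32, error) {
	b, err := r.readBytes(4)
	if err != nil {
		return 0, err
	}
	return int32(binary.LittleEndian.Uint32(b)), nil
}

// readCString returns the bytes of a NUL-terminated key (without the NUL)
// as a view into the source and skips past the terminator.
func (r *sliceReader) readCString() ([]byte, error) {
	i := bytes.IndexByte(r.src[r.offset:], 0)
	if i < 0 {
		return nil, errShort
	}
	key := r.src[r.offset : r.offset+i]
	r.offset += i + 1
	return key, nil
}

func main() {
	// BSON encoding of {"a": int32(1)}: length, type 0x10, key, value, terminator.
	doc := []byte{0x0c, 0, 0, 0, 0x10, 'a', 0, 1, 0, 0, 0, 0}
	r := &sliceReader{src: doc}

	length, _ := r.readInt32()
	typ, _ := r.readBytes(1)
	key, _ := r.readCString()
	value, _ := r.readInt32()

	fmt.Printf("len=%d type=0x%02x key=%s value=%d\n", length, typ[0], key, value)
}
```

Because the returned slices alias the source buffer, callers in a design like this must not retain them past the lifetime of the document or must copy them explicitly, which is part of the maintenance cost noted under Pitfalls.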
Pitfalls
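This change trades ease of maintenance in the bson package for performance: the custom reading code is more complex than the bufio-based implementation, and the expected gain is relative, i.e. it restores parity with V1 rather than improving on it.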