Type: Improvement
Resolution: Unresolved
Priority: Unknown
Affects Version/s: None
Component/s: BSON
The main performance issue I ran into with BSON decoding is that the bsonrw.valueReader methods readString and readCString copy every string they decode. The explicit []byte-to-string conversion allocates a new string each time. I suggest using an unsafe conversion instead, as the standard library's strings.Builder does. This could significantly reduce allocations when decoding structures with many string fields.
I understand the danger of this approach: the decoded strings would share memory with the source bytes, so any later change to those bytes would also change the strings. There are several ways to make it opt-in for users:
1. An additional struct tag, similar to how easyjson exposes this feature
2. An UnsafeValueReader that can be configured globally for `DefaultRegistry`
3. A new method on the ValueReader interface for unsafe string reads, enabled through a StringCodecOptions setting such as UnsafeStringDecoding
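To make the trade-off concrete, here is a minimal standalone sketch of an opt-in zero-copy read and the aliasing it introduces. It is not driver code: `stringReader`, its `unsafeStrings` field, and this `readString` signature are hypothetical names used only for illustration, and `unsafe.String` assumes Go 1.20 or later.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringReader is a hypothetical stand-in for a BSON value reader.
// It is not the driver's bsonrw.valueReader.
type stringReader struct {
	d             []byte // source document bytes
	unsafeStrings bool   // hypothetical opt-in flag
}

// readString returns the bytes in [start, end) as a string.
func (r *stringReader) readString(start, end int) string {
	b := r.d[start:end]
	if r.unsafeStrings && len(b) > 0 {
		// Zero-copy: the returned string shares memory with r.d.
		return unsafe.String(&b[0], len(b))
	}
	// Safe default: allocates a new string and copies the bytes.
	return string(b)
}

func main() {
	safe := &stringReader{d: []byte("hello")}
	fast := &stringReader{d: []byte("hello"), unsafeStrings: true}

	s1 := safe.readString(0, 5)
	s2 := fast.readString(0, 5)

	// Simulate a caller that reuses its buffers after decoding.
	safe.d[0] = 'j'
	fast.d[0] = 'j'

	fmt.Println(s1) // the copied string is unaffected by the write
	fmt.Println(s2) // the aliased string may now reflect the write
}
```

Whatever form the option takes, the caller must guarantee that the source buffer is neither modified nor reused while any decoded string is still alive.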
The simple replacement below showed a substantial improvement in the repository benchmarks:
// old variant
func (vr *valueReader) readString() (string, error) {
	...
	return string(vr.d[start : start+int64(length)-1]), nil
}

// new variant
func (vr *valueReader) readString() (string, error) {
	...
	k := vr.d[start : start+int64(length)-1]
	return *(*string)(unsafe.Pointer(&k)), nil
}
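The allocation difference between the two conversions can be checked in isolation with `testing.AllocsPerRun`. The sketch below is independent of the driver: `copyString`, `zeroCopyString`, `measure`, and `sink` are hypothetical helper names, and `unsafe.String` (Go 1.20+) is used here as an alternative spelling of the pointer cast shown above.

```go
package main

import (
	"fmt"
	"testing"
	"unsafe"
)

// sink keeps each result alive so the compiler cannot drop the conversion.
var sink string

// copyString mirrors the old variant: it allocates and copies.
func copyString(b []byte) string { return string(b) }

// zeroCopyString mirrors the new variant: the string aliases b.
func zeroCopyString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(&b[0], len(b))
}

// measure reports the average number of heap allocations per conversion.
func measure(conv func([]byte) string, b []byte) float64 {
	return testing.AllocsPerRun(1000, func() { sink = conv(b) })
}

func main() {
	buf := []byte("a string value long enough that copying it needs a heap allocation")

	fmt.Printf("copy:      %.0f allocs/op\n", measure(copyString, buf))
	fmt.Printf("zero-copy: %.0f allocs/op\n", measure(zeroCopyString, buf))
}
```

This only isolates the conversion cost. The effect on real documents depends on how many string fields they contain and how large those strings are.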
I would be happy to discuss other options as well.