Details
Improvement
Resolution: Unresolved
Description
The main performance issue I faced in BSON decoding is the repeated copying of strings in the bsonrw.valueReader methods readString and readCString. The explicit []byte-to-string conversion allocates and copies on every call. I suggest using an unsafe type conversion instead, as the standard library's strings.Builder does. This might significantly reduce allocations for structures with many strings.
I also understand the danger of this approach (later changes to the underlying bytes will directly affect the decoded strings), but there are several options for making it user-configurable:
1. An additional struct tag, the same way easyjson implements this feature
2. An UnsafeValueReader that can be configured globally for `DefaultRegistry`
3. Adding a method for unsafe string reads to the ValueReader interface and enabling it through StringCodecOptions: UnsafeStringDecoding
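To make the opt-in idea more concrete, here is a minimal, self-contained sketch of what a configurable reader could look like. It is not the driver's code: `miniReader`, its `unsafeStrings` field, and `encodeString` are hypothetical names used only for illustration, and the sketch assumes the standard BSON string layout (little-endian int32 length that includes the trailing NUL, then the bytes, then NUL).

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"unsafe"
)

// miniReader is a hypothetical stand-in for bsonrw.valueReader.
type miniReader struct {
	d             []byte
	offset        int
	unsafeStrings bool // hypothetical opt-in flag, off by default
}

// readString decodes one BSON string starting at r.offset.
func (r *miniReader) readString() (string, error) {
	if len(r.d)-r.offset < 4 {
		return "", errors.New("too short for length prefix")
	}
	length := int(int32(binary.LittleEndian.Uint32(r.d[r.offset:])))
	start := r.offset + 4
	if length < 1 || length > len(r.d)-start {
		return "", errors.New("invalid string length")
	}
	if r.d[start+length-1] != 0x00 {
		return "", errors.New("string is not NUL-terminated")
	}
	raw := r.d[start : start+length-1] // bytes without the trailing NUL
	r.offset = start + length
	if r.unsafeStrings {
		// No copy: the returned string aliases r.d.
		return *(*string)(unsafe.Pointer(&raw)), nil
	}
	// Default: allocate and copy, so the string is independent of r.d.
	return string(raw), nil
}

// encodeString builds the BSON encoding of s (hypothetical test helper).
func encodeString(s string) []byte {
	buf := make([]byte, 4, 4+len(s)+1)
	binary.LittleEndian.PutUint32(buf, uint32(len(s)+1))
	buf = append(buf, s...)
	return append(buf, 0x00)
}

func main() {
	doc := encodeString("hello")
	for _, fast := range []bool{false, true} {
		r := &miniReader{d: doc, unsafeStrings: fast}
		s, err := r.readString()
		fmt.Println(fast, s, err)
	}
}
```

Keeping the copying path as the default means existing users see no behavior change, and only callers who guarantee that the input buffer is never modified or reused would enable the flag.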
The simple replacement below showed a substantial improvement in the repository benchmarks:
// old variant
func (vr *valueReader) readString() (string, error) {
    ...
    return string(vr.d[start : start+int64(length)-1]), nil
}

// new variant
func (vr *valueReader) readString() (string, error) {
    ...
    k := vr.d[start : start+int64(length)-1]
    return *(*string)(unsafe.Pointer(&k)), nil
}
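For anyone who wants to see the trade-off in isolation, the following standalone sketch compares the two conversions outside the driver. The helper names `copyString` and `unsafeString` and the package-level `sink` variable are my own and are not part of the driver. It uses `testing.AllocsPerRun` to count allocations and then mutates the buffer to show the aliasing hazard mentioned above.

```go
package main

import (
	"fmt"
	"testing"
	"unsafe"
)

// sink forces the result to escape so the compiler cannot elide the copy.
var sink string

// copyString is the regular conversion: it allocates and copies the bytes.
func copyString(b []byte) string { return string(b) }

// unsafeString reinterprets the slice header as a string header.
// The result shares memory with b, so b must not be modified afterwards.
func unsafeString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	buf := []byte("a reasonably long BSON string value used for the comparison")

	copyAllocs := testing.AllocsPerRun(1000, func() { sink = copyString(buf) })
	unsafeAllocs := testing.AllocsPerRun(1000, func() { sink = unsafeString(buf) })
	fmt.Println("allocs per call, copy:  ", copyAllocs)
	fmt.Println("allocs per call, unsafe:", unsafeAllocs)

	// Aliasing hazard: changing the buffer changes the "immutable" string.
	s := unsafeString(buf)
	buf[0] = 'A'
	fmt.Println(s[:12])
}
```

On Go 1.20 and later, `unsafe.String(unsafe.SliceData(b), len(b))` expresses the same zero-copy conversion through a documented API (an empty slice may need separate handling); the pointer cast is kept here to match the proposal.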
I would be happy to discuss other options as well.