[GODRIVER-2146] Implement unsafe bson string decoding Created: 03/Sep/21  Updated: 30/Mar/22

Status: Backlog
Project: Go Driver
Component/s: BSON
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Unknown
Reporter: Vova Sokolov Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: post-1.8.0
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

The main performance issue I faced in BSON decoding is the constant copying of strings in the bsonrw.valueReader methods readString and readCString. The explicit conversion from bytes to string causes unnecessary memory allocations. I suggest using an unsafe type conversion instead, as the standard library's strings.Builder does. It could significantly reduce allocations for structures with a lot of strings.
I also understand the danger of this approach (later changes to the underlying bytes will directly affect the decoded strings), but there are several ways to make it user-configurable:
1. An additional struct tag, the same way easyjson implements this feature
2. An UnsafeValueReader that can be configured globally for `DefaultRegistry`
3. A new method on the ValueReader interface for unsafe strings, enabled through a StringCodecOptions setting: UnsafeStringDecoding
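
To make the opt-in idea and its risk concrete, here is a minimal sketch of a reader with a flag that switches between copying and zero-copy decoding. The `stringReader` type, its `unsafeStrings` field, and the `readString(start, end)` signature are hypothetical and exist only for illustration; they are not the driver's API. On Go 1.20 and later, `unsafe.String(unsafe.SliceData(b), len(b))` expresses the same conversion.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringReader is a hypothetical stand-in for a BSON value reader.
// It is NOT part of the driver; it only models "read a string out of a buffer".
type stringReader struct {
	data          []byte
	unsafeStrings bool // hypothetical opt-in flag; false keeps the copying behavior
}

// readString returns data[start:end] as a string, copying unless the
// reader was explicitly configured for zero-copy decoding.
func (r *stringReader) readString(start, end int) string {
	b := r.data[start:end]
	if r.unsafeStrings {
		// Zero-copy: the returned string shares memory with r.data.
		return *(*string)(unsafe.Pointer(&b))
	}
	// Default: allocate and copy, so the string is independent of r.data.
	return string(b)
}

func main() {
	buf := []byte("hello world")
	safe := &stringReader{data: buf}
	fast := &stringReader{data: buf, unsafeStrings: true}

	copied := safe.readString(0, 5)
	aliased := fast.readString(0, 5)

	// Mutating (or reusing) the buffer after decoding changes only the
	// zero-copy string, which is the hazard the opt-in has to document.
	buf[0] = 'J'
	fmt.Println(copied, aliased) // prints "hello Jello"
}
```

Whichever configuration option is chosen, the caller would have to guarantee that the source buffer is neither modified nor reused for as long as any decoded string is alive.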

The simple replacement below showed a substantial improvement in the repository benchmarks:

// old variant
func (vr *valueReader) readString() (string, error) {
    ...
    return string(vr.d[start : start+int64(length)-1]), nil
}
 
// new variant
func (vr *valueReader) readString() (string, error) {
    ...
    k := vr.d[start : start+int64(length)-1]
    // Reinterpret the slice header as a string header: no copy, but the result aliases vr.d.
    return *(*string)(unsafe.Pointer(&k)), nil
}
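
For anyone who wants to see the allocation difference without running the driver benchmarks, a standalone sketch along these lines should work. It is not the repository benchmark: `copyString`, `zeroCopyString`, `allocsPer`, and the 64-byte payload are names and values I picked for illustration. It uses `testing.AllocsPerRun` from the standard library, and the package-level `sink` forces each result to escape so the compiler cannot keep the copy on the stack.

```go
package main

import (
	"fmt"
	"testing"
	"unsafe"
)

// sink is package-level so every converted string escapes to the heap
// and the conversion cannot be optimized away.
var sink string

// copyString mirrors the old variant: allocate a new string and copy the bytes.
func copyString(b []byte) string { return string(b) }

// zeroCopyString mirrors the new variant: reuse the slice's backing array.
func zeroCopyString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// allocsPer reports the average number of heap allocations per conversion.
func allocsPer(convert func([]byte) string, b []byte) float64 {
	return testing.AllocsPerRun(1000, func() { sink = convert(b) })
}

func main() {
	// 64 bytes is an arbitrary size, large enough that the copy needs the heap.
	payload := make([]byte, 64)
	for i := range payload {
		payload[i] = 'x'
	}

	fmt.Println("copy allocs/op:     ", allocsPer(copyString, payload))
	fmt.Println("zero-copy allocs/op:", allocsPer(zeroCopyString, payload))
}
```

I would expect the copying variant to report one allocation per call and the zero-copy variant none, but the exact numbers depend on the Go version and compiler optimizations, so treat the output as indicative only.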

I would be happy to discuss other options as well.



 Comments   
Comment by Kevin Albertson [ 27/Sep/21 ]

Hello fly.shadow373@gmail.com, thank you for your patience. We will come back to this investigation after we have completed urgent work for an upcoming release.

Comment by Matt Dale [ 13/Sep/21 ]

fly.shadow373@gmail.com thanks for reporting this possible improvement, we will look into it soon!
