[GODRIVER-1946] Explore github.com/goccy/go-reflect performance Created: 03/Apr/21  Updated: 27/May/21  Resolved: 26/May/21

Status: Closed
Project: Go Driver
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: David Golden Assignee: Matt Dale
Resolution: Done Votes: 0
Labels: matt
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Zip Archive bson-memprofile.zip    
Issue Links:
Related
is related to GODRIVER-494 BSON Codec Redesign Closed
is related to GODRIVER-487 BSON marshal/unmarshal perf has degra... Closed
Documentation Changes: Not Needed

 Description   

A friend tipped me off to github.com/goccy/go-reflect, which is API-compatible with reflect but makes zero allocations. This might help address performance and memory issues in marshaling and unmarshaling.

There is some risk in departing from the core reflect library, but it is worth an experiment to see what the benefits are.



 Comments   
Comment by Githook User [ 27/May/21 ]

Author: Matt Dale <9760375+matthewdale@users.noreply.github.com> (username: matthewdale)

Message: GODRIVER-1946 Fix bson.BenchmarkMarshal error nil check. (#678)
Branch: master
https://github.com/mongodb/mongo-go-driver/commit/b0aff49a1ffa784ad80727cd6376e002a2ef2871

Comment by Matt Dale [ 26/May/21 ]

Closing this because the investigation is done. The conclusion is that using go-reflect won't reduce allocations in the BSON library, but there are other ways we can reduce them.

Comment by Matt Dale [ 26/May/21 ]

Doing some memory profiling using the new BSON benchmarks revealed why switching to github.com/goccy/go-reflect didn't have the intended effect. The majority of allocations made while encoding and decoding BSON are in the bsonrw.valueWriter and bsonrw.valueReader types. We should focus our efforts on minimizing allocations and maximizing slice reuse in those types. I've attached the memory profiles for more details.

bson-memprofile.zip
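To illustrate the kind of slice reuse suggested above, here is a minimal sketch. It does not use the real bsonrw.valueWriter code; encodeFresh, encodeReuse, appendInt32, and measureAllocs are hypothetical names made up for this example, and the little-endian int32 encoding stands in for real BSON writing. It compares an encoder that grows a new buffer on every call with one that appends into a caller-owned scratch buffer, using testing.AllocsPerRun from the standard library.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps results reachable so the compiler cannot discard the work.
var sink []byte

// appendInt32 appends v in little-endian order, the byte order BSON uses for int32.
func appendInt32(dst []byte, v int32) []byte {
	return append(dst, byte(v), byte(v>>8), byte(v>>16), byte(v>>24))
}

// encodeFresh (hypothetical) starts from a nil slice on every call,
// so the buffer is allocated and regrown each time.
func encodeFresh(vals []int32) []byte {
	var buf []byte
	for _, v := range vals {
		buf = appendInt32(buf, v)
	}
	return buf
}

// encodeReuse (hypothetical) truncates a caller-supplied buffer and appends
// into it. If the buffer already has enough capacity, nothing is allocated.
func encodeReuse(dst []byte, vals []int32) []byte {
	dst = dst[:0]
	for _, v := range vals {
		dst = appendInt32(dst, v)
	}
	return dst
}

// measureAllocs reports average allocations per call for both variants.
func measureAllocs() (fresh, reused float64) {
	vals := make([]int32, 64)
	for i := range vals {
		vals[i] = int32(i)
	}
	fresh = testing.AllocsPerRun(100, func() {
		sink = encodeFresh(vals)
	})
	scratch := make([]byte, 0, 4*len(vals)) // allocated once, outside the measured loop
	reused = testing.AllocsPerRun(100, func() {
		scratch = encodeReuse(scratch, vals)
		sink = scratch
	})
	return fresh, reused
}

func main() {
	fresh, reused := measureAllocs()
	fmt.Printf("fresh buffer:  %.0f allocs/op\n", fresh)
	fmt.Printf("reused buffer: %.0f allocs/op\n", reused)
}
```

The trade-off is ownership: the caller of encodeReuse must not keep the returned slice past the next call, because the next call overwrites it. Any real change to the value writer and reader would need to respect the same constraint.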

I've created tickets for reducing unnecessary memory allocations in the BSON library based on the findings:

Comment by Githook User [ 25/May/21 ]

Author: Matt Dale <9760375+matthewdale@users.noreply.github.com> (username: matthewdale)

Message: GODRIVER-1946 Add BSON, extJSON, and JSON benchmarks. (#671)
Branch: master
https://github.com/mongodb/mongo-go-driver/commit/4ed107f564ce8f99035cd4c02b2f0d6c61caa3b2

Comment by Matt Dale [ 18/May/21 ]

benchstat comparison after drop-in replacing all import "reflect" with import "github.com/goccy/go-reflect":

❯ benchstat old_bson.txt new_bson.txt 
name                                 old time/op    new time/op    delta
Marshal/simple_struct/BSON-12          1.48µs ± 2%    1.62µs ± 2%   +9.22%  (p=0.008 n=5+5)
Marshal/nested_struct/BSON-12          3.11µs ± 1%    3.16µs ± 0%   +1.37%  (p=0.008 n=5+5)
Marshal/deep_bson.json.gz/BSON-12      47.6µs ± 1%    51.3µs ± 1%   +7.74%  (p=0.008 n=5+5)
Marshal/flat_bson.json.gz/BSON-12      45.3µs ± 0%    46.5µs ± 0%   +2.72%  (p=0.016 n=4+5)
Marshal/full_bson.json.gz/BSON-12      48.3µs ± 0%    50.9µs ± 0%   +5.39%  (p=0.008 n=5+5)
Unmarshal/simple_struct/BSON-12        5.55µs ± 0%    5.55µs ± 1%     ~     (p=0.794 n=5+5)
Unmarshal/nested_struct/BSON-12        14.1µs ± 3%    14.1µs ± 1%     ~     (p=0.151 n=5+5)
Unmarshal/deep_bson.json.gz/BSON-12    69.9µs ± 0%    71.8µs ± 0%   +2.62%  (p=0.008 n=5+5)
Unmarshal/flat_bson.json.gz/BSON-12    65.9µs ± 0%    65.9µs ± 1%     ~     (p=0.421 n=5+5)
Unmarshal/full_bson.json.gz/BSON-12    79.8µs ± 1%    82.6µs ± 1%   +3.50%  (p=0.008 n=5+5)
 
name                                 old alloc/op   new alloc/op   delta
Marshal/simple_struct/BSON-12            792B ± 0%      792B ± 0%     ~     (all equal)
Marshal/nested_struct/BSON-12            793B ± 0%      793B ± 0%     ~     (all equal)
Marshal/deep_bson.json.gz/BSON-12      20.3kB ± 0%    23.3kB ± 0%  +14.90%  (p=0.008 n=5+5)
Marshal/flat_bson.json.gz/BSON-12      35.4kB ± 0%    39.5kB ± 0%  +11.58%  (p=0.008 n=5+5)
Marshal/full_bson.json.gz/BSON-12      26.6kB ± 0%    29.2kB ± 0%   +9.78%  (p=0.008 n=5+5)
Unmarshal/simple_struct/BSON-12        1.09kB ± 0%    1.09kB ± 0%     ~     (all equal)
Unmarshal/nested_struct/BSON-12        8.12kB ± 0%    8.12kB ± 0%     ~     (all equal)
Unmarshal/deep_bson.json.gz/BSON-12    30.9kB ± 0%    30.9kB ± 0%     ~     (p=0.643 n=5+5)
Unmarshal/flat_bson.json.gz/BSON-12    13.4kB ± 0%    13.4kB ± 0%     ~     (all equal)
Unmarshal/full_bson.json.gz/BSON-12    19.9kB ± 0%    21.6kB ± 0%   +8.31%  (p=0.008 n=5+5)
 
name                                 old allocs/op  new allocs/op  delta
Marshal/simple_struct/BSON-12            3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Marshal/nested_struct/BSON-12            3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Marshal/deep_bson.json.gz/BSON-12         385 ± 0%       448 ± 0%  +16.36%  (p=0.008 n=5+5)
Marshal/flat_bson.json.gz/BSON-12         303 ± 0%       304 ± 0%   +0.33%  (p=0.008 n=5+5)
Marshal/full_bson.json.gz/BSON-12         257 ± 0%       264 ± 0%   +2.72%  (p=0.008 n=5+5)
Unmarshal/simple_struct/BSON-12          62.0 ± 0%      62.0 ± 0%     ~     (all equal)
Unmarshal/nested_struct/BSON-12           143 ± 0%       143 ± 0%     ~     (all equal)
Unmarshal/deep_bson.json.gz/BSON-12       822 ± 0%       822 ± 0%     ~     (all equal)
Unmarshal/flat_bson.json.gz/BSON-12       740 ± 0%       740 ± 0%     ~     (all equal)
Unmarshal/full_bson.json.gz/BSON-12       791 ± 0%       803 ± 0%   +1.52%  (p=0.008 n=5+5)

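Perplexingly, the drop-in swap to github.com/goccy/go-reflect is actually slower and allocates more: Marshal time/op rises by 1.37% to 9.22%, and Marshal allocs/op on deep_bson.json.gz rises by 16.36%. No benchmark improved.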
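For context on what the drop-in replacement touches, here is a minimal sketch of reflection-driven struct walking, roughly the kind of work a codec does before encoding. It is not code from the driver: sample and fieldNames are hypothetical names for this example. Because the replacement library is described as API-compatible, the experiment only changes the import line; the sketch keeps the standard library import so that it runs without extra dependencies.

```go
package main

import (
	"fmt"
	"reflect" // the experiment replaces this path with "github.com/goccy/go-reflect"
	"testing"
)

// sample is a hypothetical struct used only for this illustration.
type sample struct {
	Name  string
	Count int32
	Tags  []string
}

// fieldNames (hypothetical) returns the names of a struct's fields using
// reflection. It accepts a struct or a pointer to a struct and returns nil
// for anything else.
func fieldNames(v interface{}) []string {
	t := reflect.TypeOf(v)
	if t == nil {
		return nil
	}
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct {
		return nil
	}
	names := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		names = append(names, t.Field(i).Name)
	}
	return names
}

func main() {
	s := &sample{Name: "a", Count: 1}
	// Measure how many allocations one reflective walk costs on average.
	allocs := testing.AllocsPerRun(1000, func() {
		_ = fieldNames(s)
	})
	fmt.Printf("fields=%v allocs/op=%.0f\n", fieldNames(s), allocs)
}
```

Running the same measurement before and after changing the import is a small-scale version of the comparison in the benchstat output above. It only isolates the reflection calls, so it says nothing about allocations made elsewhere in an encoder.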

Comment by Matt Dale [ 18/May/21 ]

Here's a baseline for current marshal and unmarshal performance of the BSON, mgo BSON, extended JSON, and Go JSON libraries:

❯ go test ./bson/... -run=X -bench='Marshal|Unmarshal' -benchmem
goos: darwin
goarch: amd64
pkg: go.mongodb.org/mongo-driver/bson
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkMarshal/simple_struct/BSON-12            804342              1476 ns/op             792 B/op          3 allocs/op
BenchmarkMarshal/simple_struct/mgoBSON-12        1000000              1184 ns/op             960 B/op          4 allocs/op
BenchmarkMarshal/simple_struct/extJSON-12         185630              6455 ns/op            3476 B/op         88 allocs/op
BenchmarkMarshal/simple_struct/JSON-12           1343092               889.9 ns/op           240 B/op          1 allocs/op
BenchmarkMarshal/nested_struct/BSON-12            376893              3120 ns/op             793 B/op          3 allocs/op
BenchmarkMarshal/nested_struct/mgoBSON-12         445990              2646 ns/op             960 B/op          4 allocs/op
BenchmarkMarshal/nested_struct/extJSON-12         114273             10198 ns/op            4630 B/op        132 allocs/op
BenchmarkMarshal/nested_struct/JSON-12            995380              1243 ns/op             352 B/op          1 allocs/op
BenchmarkMarshal/deep_bson.json.gz/BSON-12                 25448             47198 ns/op           20314 B/op        385 allocs/op
BenchmarkMarshal/deep_bson.json.gz/mgoBSON-12              18054             66448 ns/op           24198 B/op        638 allocs/op
BenchmarkMarshal/deep_bson.json.gz/extJSON-12              16924             70858 ns/op           36027 B/op        953 allocs/op
BenchmarkMarshal/deep_bson.json.gz/JSON-12                 21895             55768 ns/op           27507 B/op        631 allocs/op
BenchmarkMarshal/flat_bson.json.gz/BSON-12                 26782             44899 ns/op           35383 B/op        303 allocs/op
BenchmarkMarshal/flat_bson.json.gz/mgoBSON-12              16612             72582 ns/op           43667 B/op        641 allocs/op
BenchmarkMarshal/flat_bson.json.gz/extJSON-12              10000            110176 ns/op           91809 B/op       1336 allocs/op
BenchmarkMarshal/flat_bson.json.gz/JSON-12                 17473             68987 ns/op           22009 B/op        301 allocs/op
BenchmarkMarshal/full_bson.json.gz/BSON-12                 24639             48652 ns/op           26564 B/op        257 allocs/op
BenchmarkMarshal/full_bson.json.gz/mgoBSON-12              15468             77155 ns/op           33483 B/op        509 allocs/op
BenchmarkMarshal/full_bson.json.gz/extJSON-12              10000            113155 ns/op           81803 B/op       1254 allocs/op
BenchmarkMarshal/full_bson.json.gz/JSON-12                 18602             64665 ns/op           18802 B/op        277 allocs/op
BenchmarkUnmarshal/simple_struct/BSON-12                  194044              5642 ns/op            1088 B/op         62 allocs/op
BenchmarkUnmarshal/simple_struct/mgoBSON-12               228786              5311 ns/op            1346 B/op         74 allocs/op
BenchmarkUnmarshal/simple_struct/extJSON-12                66997             18086 ns/op            7379 B/op        236 allocs/op
BenchmarkUnmarshal/simple_struct/JSON-12                  190028              6394 ns/op            1040 B/op         58 allocs/op
BenchmarkUnmarshal/nested_struct/BSON-12                   82845             14083 ns/op            8117 B/op        143 allocs/op
BenchmarkUnmarshal/nested_struct/mgoBSON-12               126615              9567 ns/op            5586 B/op        130 allocs/op
BenchmarkUnmarshal/nested_struct/extJSON-12                38109             32365 ns/op           16729 B/op        384 allocs/op
BenchmarkUnmarshal/nested_struct/JSON-12                  174238              6850 ns/op            5250 B/op         74 allocs/op
BenchmarkUnmarshal/deep_bson.json.gz/BSON-12               16954             70555 ns/op           30861 B/op        822 allocs/op
BenchmarkUnmarshal/deep_bson.json.gz/mgoBSON-12            24141             49646 ns/op           27568 B/op        760 allocs/op
BenchmarkUnmarshal/deep_bson.json.gz/extJSON-12             8961            136525 ns/op           64509 B/op       1776 allocs/op
BenchmarkUnmarshal/deep_bson.json.gz/JSON-12               38078             32024 ns/op           23600 B/op        388 allocs/op
BenchmarkUnmarshal/flat_bson.json.gz/BSON-12               18020             67069 ns/op           13431 B/op        740 allocs/op
BenchmarkUnmarshal/flat_bson.json.gz/mgoBSON-12            20113             59859 ns/op           18762 B/op        885 allocs/op
BenchmarkUnmarshal/flat_bson.json.gz/extJSON-12             5020            228594 ns/op           82201 B/op       2659 allocs/op
BenchmarkUnmarshal/flat_bson.json.gz/JSON-12               12666             95112 ns/op           13729 B/op        680 allocs/op
BenchmarkUnmarshal/full_bson.json.gz/BSON-12               14722             81256 ns/op           19929 B/op        791 allocs/op
BenchmarkUnmarshal/full_bson.json.gz/mgoBSON-12            12050            100373 ns/op           38770 B/op       1346 allocs/op
BenchmarkUnmarshal/full_bson.json.gz/extJSON-12             4214            285696 ns/op          111837 B/op       3489 allocs/op
BenchmarkUnmarshal/full_bson.json.gz/JSON-12               12562             95759 ns/op           32785 B/op        844 allocs/op
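For readers unfamiliar with where the ns/op, B/op, and allocs/op columns come from, here is a minimal sketch of a marshal and unmarshal benchmark pair. It is not the driver's benchmark code: simpleStruct, benchMarshal, and benchUnmarshal are hypothetical, and it uses encoding/json so that it runs with the standard library alone. It calls testing.Benchmark directly instead of going through go test, and it checks err != nil inside the loop so that a failing call stops the benchmark instead of being timed.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// simpleStruct is a hypothetical stand-in for the "simple_struct" test case.
type simpleStruct struct {
	Name   string  `json:"name"`
	Count  int32   `json:"count"`
	Score  float64 `json:"score"`
	Active bool    `json:"active"`
}

// benchMarshal times json.Marshal on a small struct.
func benchMarshal(b *testing.B) {
	v := simpleStruct{Name: "a", Count: 1, Score: 2.5, Active: true}
	b.ReportAllocs() // include B/op and allocs/op, like -benchmem
	for i := 0; i < b.N; i++ {
		if _, err := json.Marshal(v); err != nil {
			b.Fatal(err)
		}
	}
}

// benchUnmarshal times json.Unmarshal into a fresh struct each iteration.
func benchUnmarshal(b *testing.B) {
	data := []byte(`{"name":"a","count":1,"score":2.5,"active":true}`)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		var v simpleStruct
		if err := json.Unmarshal(data, &v); err != nil {
			b.Fatal(err)
		}
	}
}

func main() {
	benches := []struct {
		name string
		fn   func(*testing.B)
	}{
		{"Marshal", benchMarshal},
		{"Unmarshal", benchUnmarshal},
	}
	for _, bm := range benches {
		r := testing.Benchmark(bm.fn)
		// String gives iterations and ns/op; MemString gives B/op and allocs/op.
		fmt.Printf("%-10s %s %s\n", bm.name, r.String(), r.MemString())
	}
}
```

In a real package these functions would live in a _test.go file as BenchmarkMarshal and BenchmarkUnmarshal, with b.Run sub-benchmarks per input and per library, which is what produces names such as BenchmarkMarshal/simple_struct/BSON-12 in the output above.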

Comment by Matt Dale [ 14/May/21 ]

Looking into this. Previous investigations into BSON performance used the mgo BSON implementation as the benchmark, so I'll start by running benchmarks for both the mgo BSON library and mongo-go-driver BSON library to get a comparable baseline.

Some history of BSON performance investigation and improvement:

 

Generated at Thu Feb 08 08:37:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.