[GODRIVER-1142] Improve decodeCommandOpMsg performance Created: 16/Jun/19  Updated: 26/Sep/19  Resolved: 26/Jun/19

Status: Closed
Project: Go Driver
Component/s: Wire Protocol
Affects Version/s: 1.0.3
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Łukasz Marszał Assignee: Kristofer Brandow (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related

 Description   

I was running performance tests on my app, which is written in Go and uses mongo-driver.
I saw that the app spends a lot of time in the decodeCommandOpMsg function.

After further investigation I realized that decodeCommandOpMsg decodes the full BSON payload (all the way down) just to merge the wire message sections together and return them as binary data (mainDoc.UnmarshalBSON and then mainDoc.MarshalBSON back again).

My theory is that it doesn't need to care about the internal structure of a document at all; it only needs to merge the top-level fields.

I've written a simple benchmark, and it seems decodeCommandOpMsg can be sped up by more than an order of magnitude (roughly 30x to 55x in ns/op in the results below).

You can see the benchmark code here: https://github.com/g2a-com/mongo-go-driver/commit/1cf2b874b5f8dd022c2b7aed7e9090256eb335d8?diff=unified
And a proof of concept of the improvement here: https://github.com/g2a-com/mongo-go-driver/commit/deeacb2e7aa5e9cd7820c52b4f945fdc5d09be52

Results - without the change:

root@d705c9e30444:/go/src/go.mongodb.org/mongo-driver# go test -benchmem -bench=BenchmarkReadOpMsg ./benchmark
goos: linux
goarch: amd64
pkg: go.mongodb.org/mongo-driver/benchmark
BenchmarkReadOpMsgFlatDecoding-2           20000             89926 ns/op           95792 B/op        609 allocs/op
BenchmarkReadOpMsgDeepDecoding-2           10000            107868 ns/op           50503 B/op       1185 allocs/op
PASS
ok      go.mongodb.org/mongo-driver/benchmark   3.803s

Results - with my change (https://github.com/g2a-com/mongo-go-driver/commit/deeacb2e7aa5e9cd7820c52b4f945fdc5d09be52):

root@d705c9e30444:/go/src/go.mongodb.org/mongo-driver# go test -benchmem -bench=BenchmarkReadOpMsg ./benchmark
goos: linux
goarch: amd64
pkg: go.mongodb.org/mongo-driver/benchmark
BenchmarkReadOpMsgFlatDecoding-2          500000              2993 ns/op            6464 B/op         15 allocs/op
BenchmarkReadOpMsgDeepDecoding-2         1000000              1947 ns/op            2368 B/op         15 allocs/op
PASS
ok      go.mongodb.org/mongo-driver/benchmark   4.516s

 

A fun fact: without my change, in most cases documents returned from MongoDB are unmarshaled twice, once in decodeCommandOpMsg and a second time when the driver's user reads the data (e.g. by cursor.Unmarshal).

Any thoughts?



 Comments   
Comment by Kristofer Brandow (Inactive) [ 26/Jun/19 ]

lmarszal, no problem! Thanks for working with us on this. I'll close this ticket out.

Comment by Łukasz Marszał [ 24/Jun/19 ]

kris.brandow - OK, now I see what you mean. I didn't realize that x/network/command/opmsg.go is not used on the master branch.

I ported my change back to v1.0.3 (this is where I did my original investigation): https://github.com/g2a-com/mongo-go-driver/tree/backport_opmsg_performance
Then I ran a simple benchmark (find and return deep_bson) against three mongo-go-driver versions:

func BenchmarkFind(b *testing.B) {
	// ...
	// some boilerplate...
	// ...
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		filter := bson.D{{Key: "_id", Value: insertedID}}
		cur, err := col.Find(ctx, &filter)
		if err != nil {
			panic(err)
		}

		for cur.Next(ctx) {
			// do nothing - just browse whole cursor
		}
	}
}

Results:

goos: linux
goarch: amd64
pkg: gomongobench_stable
BenchmarkFind-2              3000            419482 ns/op           59534 B/op       1274 allocs/op
PASS
ok       gomongobench_stable     1.332s
 
 
goos: linux
goarch: amd64
pkg: gomongobench_master
BenchmarkFind-2             10000            247603 ns/op            6451 B/op         63 allocs/op
PASS
ok       gomongobench_master     2.522s
 
 
goos: linux
goarch: amd64
pkg: gomongobench_mine
BenchmarkFind-2              5000            251129 ns/op           11079 B/op         99 allocs/op
PASS
ok       gomongobench_mine       1.311s

This clearly shows that the problem reported here is fixed on master. Sorry for the confusion.

Comment by Kristofer Brandow (Inactive) [ 21/Jun/19 ]

lmarszal, the files you provided use the old code. The new code is located here: https://github.com/mongodb/mongo-go-driver/blob/e4bd39d2c760255a391c8c0239833ee2ddb011c2/x/mongo/driver/operation.go#L1098-L1134. This code does minimal parsing and uses the bsoncore package, so it shouldn't have the same performance issues as the code from the command package.

Comment by Łukasz Marszał [ 21/Jun/19 ]

kris.brandow, everything is provided above and checked in to https://github.com/g2a-com/mongo-go-driver/tree/opmsg_performance

I was using data from the mongo-go-driver repo (flat_bson.json and deep_bson.json).
Just cherry-pick these two files into the benchmark folder of your local copy of the mongo-go-driver repo:

Then run make perf to download the test data, and you can run these benchmarks with the commands provided above.

If you want to test my improvements, also download and replace this file: https://github.com/g2a-com/mongo-go-driver/blob/deeacb2e7aa5e9cd7820c52b4f945fdc5d09be52/x/network/command/opmsg.go

Comment by Kristofer Brandow (Inactive) [ 21/Jun/19 ]

lmarszal, can you please provide the full test files to reproduce?

Comment by Łukasz Marszał [ 21/Jun/19 ]

esha.bhargava - I ran the benchmark on the latest commit to master on GitHub, 4f95475. Three more commits have become publicly available since then (d4dabd4, 2b8bc9c and 24c422e), and benchmarking on them gives the same results (not a surprise, as none of these commits changes anything around bson or opmsg).

Comment by Esha Bhargava [ 20/Jun/19 ]

lmarszal Can you try benchmarking on master? We have changed this code.
