In Cloud Free Monitoring we're seeing very slow performance in bson.Marshal()/Unmarshal() (and mongo.Cursor.Decode()).
A query returning ~4000 largish documents might take tens of milliseconds on the DB, but unmarshaling those documents into our model structs takes 4-5 seconds, approximately 1ms per doc! These docs are on the order of 10 KB in size.
We also have an area where we read larger documents (~40 KB each), and these are even slower to unmarshal, on the order of 3ms/doc.
Dropping mgo into the same codebase (just for BSON marshal/unmarshal, with mongo-go-driver still in place for driver duties) gave us an approximately 20x improvement.
Out of curiosity, I put together some benchmarks using canned structs of varying size and object depth, comparing BSON marshal/unmarshal in mgo and mongo-go-driver: https://github.com/dunkyboy/mongo-driver-bench. The benchmarks corroborated our findings, showing mgo to be ~3x faster for very simple structs and 40x+ faster for larger, more complex structs.
FWIW we're planning on working around this in Cloud Free Monitoring by using mgo purely for BSON marshal/unmarshal. But we'd much prefer to just use mongo-go-driver, so looking forward to improvements!