- Type: Task
- Resolution: Done
- Priority: Unknown
- Affects Version/s: None
- Component/s: None
- Fully Compatible
- Dotnet Drivers
- Not Needed
Summary
BsonEncodingPoco produces inflated BSON output and uses a stale DataSetSize constant, causing the reported MB/s score to be understated by ~45%.
Background
CSHARP-6003 (DRIVERS-3377) updated DeepPocoNode to use strongly-typed properties:
- right/left (DeepPocoNode) for inner nodes
- rightValue/leftValue (string) for leaf nodes
The updated deep_bson.json from the specs repo uses the same field names. However, two issues remain:
Issue 1 — Null field bloat (+48% output size)
DeepPocoNode has four properties per node, but any given node only populates two of them — the other two are null. The C# driver serializes null reference properties by default. On re-serialization, every node emits two extra null BSON fields, inflating the output from 2,286B to 3,392B per document (+48%).
Issue 2 — Stale DataSetSize (-14% baseline)
DataSetSize = 19_640_000 was calibrated to the old deep_bson.json (1,964B per doc). The new file is 2,286B per doc, so the constant needs updating to 22_860_000.
Combined effect: BsonEncodingPoco score is understated by ~45% vs what it should be.
Empirically measured against the actual deep_bson.json from the specs repo:
| Encoding | Size per document |
| --- | --- |
| Raw BSON (new file) | 2,286B |
| POCO-encoded (current) | 3,392B (+48%) |
| POCO-encoded (with fix) | 2,286B (+0%) |
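The percentages and the new constant quoted above can be rechecked with a few lines of arithmetic. The sketch below is only a sanity check, not benchmark code: the class name `DataSetSizeCheck` is hypothetical, and the factor of 10,000 documents per data set is inferred from 19_640_000 / 1,964 rather than stated in this ticket.

```csharp
using System;

// Hypothetical helper for rechecking the figures in this ticket.
public static class DataSetSizeCheck
{
    // Per-document sizes in bytes, copied from the ticket text.
    public const int OldDocBytes = 1_964;
    public const int NewDocBytes = 2_286;
    public const int InflatedDocBytes = 3_392;

    // Assumption: inferred from 19_640_000 / 1_964, not stated in the ticket.
    public const int DocsPerDataSet = 10_000;

    // Relative change from one size to another, in percent.
    public static double PercentChange(double from, double to)
        => (to - from) / from * 100.0;

    // Output bloat caused by the two null fields per node (Issue 1).
    public static double NullFieldBloatPercent()
        => PercentChange(NewDocBytes, InflatedDocBytes);

    // How far the stale constant sits below the new file size (Issue 2).
    public static double StaleBaselinePercent()
        => PercentChange(NewDocBytes, OldDocBytes);

    // Updated DataSetSize under the inferred documents-per-data-set factor.
    public static int UpdatedDataSetSize()
        => NewDocBytes * DocsPerDataSet;
}
```

Rounded to whole percent, `NullFieldBloatPercent()` and `StaleBaselinePercent()` match the +48% and -14% in the issue headings, and `UpdatedDataSetSize()` gives 22_860_000.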
Fix
- Add [BsonIgnoreIfNull] to all four properties on DeepPocoNode in BsonBenchmarkDataTypes.cs
- Update DataSetSize from 19_640_000 to 22_860_000 in BsonEncodingBenchmark.cs and BsonDecodingBenchmark.cs
Note: BsonDecodingPoco is not affected by the null field issue (input bytes are always the raw file), but its DataSetSize constant still needs updating for the file size change.
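The first fix item above might look roughly like the following. This is a minimal sketch, not the actual contents of BsonBenchmarkDataTypes.cs: the C# property names, the `[BsonElement]` mappings, and the `NullFieldDemo` helper are assumptions made for illustration. Only `[BsonIgnoreIfNull]` and the four field names come from this ticket.

```csharp
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

// Assumed shape of DeepPocoNode; the real class may differ.
public sealed class DeepPocoNode
{
    // Inner nodes populate left/right and leave the value fields null.
    [BsonElement("left")]
    [BsonIgnoreIfNull]
    public DeepPocoNode Left { get; set; }

    [BsonElement("right")]
    [BsonIgnoreIfNull]
    public DeepPocoNode Right { get; set; }

    // Leaf nodes populate leftValue/rightValue and leave left/right null.
    [BsonElement("leftValue")]
    [BsonIgnoreIfNull]
    public string LeftValue { get; set; }

    [BsonElement("rightValue")]
    [BsonIgnoreIfNull]
    public string RightValue { get; set; }
}

// Hypothetical helper used to inspect the encoded size of one leaf node.
public static class NullFieldDemo
{
    public static int EncodedLeafSize()
    {
        var leaf = new DeepPocoNode { LeftValue = "a", RightValue = "b" };

        // With [BsonIgnoreIfNull], the null Left and Right properties
        // are omitted from the serialized document.
        return leaf.ToBson().Length;
    }
}
```

Comparing `EncodedLeafSize()` with and without the attributes is a quick way to confirm that the two null elements per node are what inflates the output.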
Relevant spec changes: https://github.com/mongodb/specifications/commit/c593b9e45b923fa1a14c2311b8bae453eaabd4f3
Related
- DRIVERS-3377: upstream ticket for the deep_bson.json change
- CSHARP-6003: C# implementation of DRIVERS-3377 (PR #2018)
- The specs repo PR (mongodb/specifications#1885) updated the JSON but did not update the 19,640,000 figure in benchmarking.md; a separate follow-up may be needed there.