Fix BsonEncodingPoco benchmark: add BsonIgnoreIfNull to DeepPocoNode and update DataSetSize

      Summary

      BsonEncodingPoco produces inflated BSON output and uses a stale DataSetSize constant, causing the reported MB/s score to be understated by ~45%.

      Background

      CSHARP-6003 (DRIVERS-3377) updated DeepPocoNode to use strongly-typed properties:

      • right/left (DeepPocoNode) for inner nodes
      • rightValue/leftValue (string) for leaf nodes

      The updated deep_bson.json from the specs repo uses the same field names. However, two issues remain:

      Issue 1 — Null field bloat (+48% output size)

      DeepPocoNode has four properties per node, but any given node only populates two of them — the other two are null. The C# driver serializes null reference properties by default. On re-serialization, every node emits two extra null BSON fields, inflating the output from 2,286B to 3,392B per document (+48%).

      Issue 2 — Stale DataSetSize (-14% baseline)

      DataSetSize = 19_640_000 was calibrated to the old deep_bson.json (1,964B per document). The new file is 2,286B per document, so the constant needs updating to 22_860_000.

      Combined effect: the BsonEncodingPoco score is understated by ~45% relative to the correct value.

      Empirically measured against the actual deep_bson.json from the specs repo:

      Raw BSON (new file)        2,286B
      POCO-encoded (current)     3,392B (+48%)
      POCO-encoded (with fix)    2,286B (+0%)
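
      The percentages and the new constant quoted above can be re-derived from the per-document sizes. The sketch below is only an arithmetic check, not benchmark code. It assumes 10,000 documents per run, which is inferred from 19_640_000 / 1,964 and is not stated in this ticket; the variable names are illustrative.

```csharp
using System;

// Per-document sizes in bytes, taken from the measurements in this ticket.
const int oldDocBytes = 1_964;   // old deep_bson.json
const int newDocBytes = 2_286;   // new deep_bson.json (raw BSON)
const int pocoDocBytes = 3_392;  // POCO-encoded output without the fix

// Assumption: documents per run, inferred from 19_640_000 / 1_964.
const int docsPerRun = 10_000;

int oldDataSetSize = oldDocBytes * docsPerRun;
int newDataSetSize = newDocBytes * docsPerRun;

// Relative growth of the encoded output caused by the null fields.
double bloatPercent = ((double)pocoDocBytes / newDocBytes - 1) * 100;

// Relative shortfall of the stale constant against the new file size.
double stalePercent = ((double)oldDataSetSize / newDataSetSize - 1) * 100;

Console.WriteLine(oldDataSetSize);            // 19640000
Console.WriteLine(newDataSetSize);            // 22860000
Console.WriteLine(Math.Round(bloatPercent));  // 48
Console.WriteLine(Math.Round(stalePercent));  // -14
```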

      Fix

      1. Add [BsonIgnoreIfNull] to all four properties on DeepPocoNode in BsonBenchmarkDataTypes.cs
      2. Update DataSetSize from 19_640_000 to 22_860_000 in BsonEncodingBenchmark.cs and BsonDecodingBenchmark.cs
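
      A minimal sketch of what step 1 could look like, assuming the MongoDB.Bson package is referenced. The C# property names (Left, Right, LeftValue, RightValue) and the [BsonElement] mappings are assumptions for illustration; only the element names right/left/rightValue/leftValue come from this ticket, and the real DeepPocoNode in BsonBenchmarkDataTypes.cs may be declared differently.

```csharp
#nullable enable
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

// Leaf node: only the string-valued properties are populated.
var leaf = new DeepPocoNode { LeftValue = "a", RightValue = "b" };

// Inner node: only the node-valued properties are populated.
var root = new DeepPocoNode { Left = leaf, Right = leaf };

// With [BsonIgnoreIfNull], unpopulated properties are skipped on
// serialization, so each node should emit only its populated elements.
Console.WriteLine(leaf.ToBsonDocument().ElementCount);
Console.WriteLine(root.ToBsonDocument().ElementCount);

// Encoded size in bytes of the whole tree.
Console.WriteLine(root.ToBson().Length);

// Hypothetical shape of the benchmark type; names are assumptions.
public sealed class DeepPocoNode
{
    [BsonElement("left"), BsonIgnoreIfNull]
    public DeepPocoNode? Left { get; set; }

    [BsonElement("right"), BsonIgnoreIfNull]
    public DeepPocoNode? Right { get; set; }

    [BsonElement("leftValue"), BsonIgnoreIfNull]
    public string? LeftValue { get; set; }

    [BsonElement("rightValue"), BsonIgnoreIfNull]
    public string? RightValue { get; set; }
}
```

      Removing the four [BsonIgnoreIfNull] attributes from this sketch and re-running it is a quick way to observe the extra null elements described in Issue 1.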

      Note: BsonDecodingPoco is not affected by the null field issue (input bytes are always the raw file), but its DataSetSize constant still needs updating for the file size change.

      Relevant spec changes: https://github.com/mongodb/specifications/commit/c593b9e45b923fa1a14c2311b8bae453eaabd4f3

      Related

      • DRIVERS-3377 — upstream ticket for the deep_bson.json change
      • CSHARP-6003 — C# implementation of DRIVERS-3377 (PR #2018)
      • The specs repo PR (mongodb/specifications#1885) updated the JSON but did not update the 19,640,000 figure in benchmarking.md — a separate follow-up may be needed there.

            Assignee:
            Boris Dogadov
            Reporter:
            Adelin Mbida Owona