• Type: Sub-task
    • Resolution: Fixed
    • Priority: Major - P3
    • 9.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Storage Execution
    • Fully Compatible
    • Storage Execution 2026-05-11

      Add a google-benchmark microbenchmark for aggregateSizeCountDeltasInOplog and aggregateMultiOpSizeMetadata so that we can establish a baseline today and measure the impact of any future change to the parsing.

      Workloads to cover:

      Each workload generates synthetic oplog BSON in-memory once per benchmark instance (outside the timed loop) and feeds either a vector of BSONObj or a typed input where appropriate. Each workload sweeps its primary dimension via RangeMultiplier(4)->Range(1<<10, 1<<16), giving data points at roughly 1k, 4k, 16k, and 64k. The sweep makes per-entry cost visible at scale and catches non-linear regressions.
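
      As a quick sanity check on the sweep described above, the following stdlib-only sketch approximates how a Range(lo, hi) with a range multiplier expands into data points. The helper names sweepPoints and innerOpsFor are hypothetical and exist only for illustration; this is not the google-benchmark implementation, just a model of its documented behavior (lo, the powers of the multiplier in between, then hi).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: approximates the sweep points produced by
// RangeMultiplier(mult)->Range(lo, hi): lo, every power of mult strictly
// between lo and hi, then hi.
std::vector<std::int64_t> sweepPoints(std::int64_t lo, std::int64_t hi, std::int64_t mult) {
    assert(lo > 0 && lo <= hi && mult >= 2);
    std::vector<std::int64_t> points{lo};
    for (std::int64_t p = 1; p < hi; p *= mult) {
        if (p > lo) {
            points.push_back(p);
        }
    }
    if (hi != lo) {
        points.push_back(hi);
    }
    return points;
}

// Hypothetical helper: inner CRUD ops covered by one applyOps sweep point,
// assuming 100 inner ops per applyOps entry as stated in this ticket.
std::int64_t innerOpsFor(std::int64_t outerApplyOps, std::int64_t innerPerEntry = 100) {
    return outerApplyOps * innerPerEntry;
}
```

      For Range(1<<10, 1<<16) with a multiplier of 4 this yields 1024, 4096, 16384, and 65536, and the smallest transactional sweep point covers 102400 inner ops.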

      • BM_Scan_AllInsertCRUD: all i ops with m, ui, ns populated, distinct UUIDs. Sweep on entry count.
      • BM_Scan_MixedCRUD: mix of i/u/d roughly 60/30/10. Sweep on entry count.
      • BM_Scan_TransactionalApplyOps: applyOps entries with 100 inner CRUD ops each. Sweep on outer applyOps count (so 1<<10 means ~100k inner ops).
      • BM_Scan_NoiseHeavy: entries dominated by op-types we skip (n, container ops, c with non-interesting command types). Sweep on entry count.
      • BM_Scan_FastCountStoreWrites: entries on the internal config.fast_count_metadata_store collection. Sweep on entry count.
      • BM_AggregateMultiOp_ReplOperation: ReplOperation objects fed to aggregateMultiOpSizeMetadata(const std::vector<repl::ReplOperation>&). Sweep on op count. Covers the prepare write path.
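
      One way to keep the BM_Scan_MixedCRUD input reproducible is to derive the op type from the entry index instead of a random generator. The sketch below is an assumption about how the 60/30/10 mix could be produced; makeMixedOpTypes and countOp are hypothetical names, and a real benchmark would build a BSONObj oplog entry for each op type rather than a string of characters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns one op-type character per synthetic oplog
// entry ('i' insert, 'u' update, 'd' delete). A fixed 10-entry pattern gives
// an exact 60/30/10 split for every multiple of 10 while keeping the op types
// interleaved.
std::string makeMixedOpTypes(std::size_t entryCount) {
    static const std::string kPattern = "iuiiuidiui";  // 6 x i, 3 x u, 1 x d
    assert(kPattern.size() == 10);
    std::string ops;
    ops.reserve(entryCount);
    for (std::size_t k = 0; k < entryCount; ++k) {
        ops.push_back(kPattern[k % kPattern.size()]);
    }
    return ops;
}

// Hypothetical helper: counts how many entries have the given op type.
std::size_t countOp(const std::string& ops, char op) {
    return static_cast<std::size_t>(std::count(ops.begin(), ops.end(), op));
}
```

      Because the pattern is deterministic, repeated runs scan byte-identical input, which removes one source of run-to-run variance.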

      What to measure:

      • ns/op (per benchmark iteration).
      • Per-entry ns derived via state.SetItemsProcessed(state.iterations() * state.range(0)) so items_per_second is reported and per-entry cost is directly comparable across sweep points.
      • For each workload, run against the current master so the parent ticket's PRs can attach a delta.
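
      The relationship between the reported numbers can be sketched as plain arithmetic. The helpers below (perEntryNs, itemsPerSecond) are hypothetical and only illustrate the derivation, assuming totalNs is the wall time spent in the timed loop across all iterations and entriesPerIteration corresponds to state.range(0).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: average cost of one oplog entry in nanoseconds.
double perEntryNs(double totalNs, std::int64_t iterations, std::int64_t entriesPerIteration) {
    assert(iterations > 0 && entriesPerIteration > 0);
    return totalNs / (static_cast<double>(iterations) * static_cast<double>(entriesPerIteration));
}

// Hypothetical helper: throughput in entries per second, the quantity that
// an items-processed counter of iterations * entriesPerIteration implies.
double itemsPerSecond(double totalNs, std::int64_t iterations, std::int64_t entriesPerIteration) {
    assert(totalNs > 0.0);
    const double items = static_cast<double>(iterations) * static_cast<double>(entriesPerIteration);
    return items * 1e9 / totalNs;
}
```

      For example, 2 iterations over 1024 entries taking 2,048,000 ns in total correspond to 1000 ns per entry, or one million entries per second. The two values are reciprocals up to the ns-to-seconds factor, which is why either one is comparable across sweep points.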

      Acceptance criteria:

      • Benchmark file builds and runs locally with the standard mongo google-benchmark target.
      • All workloads produce stable numbers across repeated runs (no allocator-warmup or first-run skew dominating the result). Verify by running each workload at least three times.
      • Master baseline numbers are captured and attached to the parent ticket as a comment so future PRs have a single reference point.
      • No production code is changed by this ticket. Benchmark setup is self-contained.
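
      The stability criterion above does not name a threshold. A minimal sketch of one possible check, assuming the coefficient of variation across repeated runs is the chosen metric and that 5% is an acceptable bound (both are assumptions, not part of this ticket):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: sample standard deviation divided by the mean for a
// set of repeated measurements (for example, ns/op from three runs).
double coefficientOfVariation(const std::vector<double>& samples) {
    assert(samples.size() >= 2);
    double sum = 0.0;
    for (double s : samples) {
        sum += s;
    }
    const double mean = sum / static_cast<double>(samples.size());
    assert(mean > 0.0);
    double squaredDeviations = 0.0;
    for (double s : samples) {
        squaredDeviations += (s - mean) * (s - mean);
    }
    const double stddev = std::sqrt(squaredDeviations / static_cast<double>(samples.size() - 1));
    return stddev / mean;
}

// Hypothetical helper: treats a workload as stable when the spread across
// runs stays within maxCv. The 5% default is an assumed threshold.
bool isStable(const std::vector<double>& samples, double maxCv = 0.05) {
    return coefficientOfVariation(samples) <= maxCv;
}
```

      With three runs at 100, 101, and 99 ns/op the coefficient of variation is 1%, so the workload would pass; runs at 100, 150, and 90 ns/op would fail and suggest warm-up or first-run skew.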

            Assignee:
            Ernesto Rodriguez Reina
            Reporter:
            Ernesto Rodriguez Reina