OpTimeObserverDispatcher::notify() causes per-write regression in insert throughput after SERVER-123974


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 9.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Storage Execution
    • Fully Compatible
    • ALL
    • Storage Execution 2026-05-11
    • 200

      Problem

      SERVER-123974 introduced a performance regression of 10–27% in insert throughput on the perf-mongo-perf-repl-all-feature-flags sys-perf variant, observed on the insert_read_commands task.

      Root Cause

      SERVER-123974 registers an appliedOpTimeObserver. With the observer registered, every primary write calls _setMyLastAppliedOpTimeAndWallTime => _appliedOpTimeDispatcher.notify() => _event.fetchAndAdd(1) + _event.notifyAll(). Each such call triggers a futex_wake syscall. The background thread drains quickly (one atomic store) and blocks again, but at ~300K inserts/second (ARM, replica set) the wake-ups add up: 300K × 400 ns = 120 ms of pure syscall overhead per second, about 12% of each second, on a path that previously cost nothing. That accounts for a ~10–15% throughput regression.

      Fix

      Rate-limit OpTimeObserverDispatcher::notify() to fire at most once per wall-clock second by comparing the opTime seconds field against the seconds encoded in _pendingTs (the existing pending-timestamp field).
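
      A minimal sketch of the once-per-second throttle described above, not the server's actual code. It assumes the pending timestamp is a 64-bit value with the seconds in the high 32 bits. ThrottledDispatcher, makeTs and the wakeups counter are hypothetical names used only for illustration; the counter stands in for the real _event.fetchAndAdd(1) + _event.notifyAll() wake-up.

```cpp
#include <atomic>
#include <cstdint>

// Packs seconds and an increment into a 64-bit timestamp, seconds in the
// high 32 bits (the layout this sketch assumes for _pendingTs).
constexpr std::uint64_t makeTs(std::uint32_t secs, std::uint32_t inc) {
    return (static_cast<std::uint64_t>(secs) << 32) | inc;
}

// Hypothetical stand-in for OpTimeObserverDispatcher; illustration only.
class ThrottledDispatcher {
public:
    // Called on every write. Returns true only when a wake-up was issued.
    bool notify(std::uint64_t opTimeTs) {
        std::uint64_t pending = _pendingTs.load(std::memory_order_relaxed);
        do {
            // Same (or older) wall-clock second as the last wake-up: skip.
            if ((opTimeTs >> 32) <= (pending >> 32)) {
                return false;
            }
            // On CAS failure `pending` is refreshed and the check is repeated.
        } while (!_pendingTs.compare_exchange_weak(pending,
                                                   opTimeTs,
                                                   std::memory_order_release,
                                                   std::memory_order_relaxed));

        // Only one writer per second reaches this point. A real dispatcher
        // would bump its event counter and wake the background thread here.
        _wakeups.fetch_add(1, std::memory_order_relaxed);
        return true;
    }

    std::uint64_t wakeups() const {
        return _wakeups.load(std::memory_order_relaxed);
    }

private:
    std::atomic<std::uint64_t> _pendingTs{0};
    std::atomic<std::uint64_t> _wakeups{0};
};
```

      In this sketch the common case (a write in a second that has already produced a wake-up) costs one relaxed load and a comparison, so the syscall is paid at most once per second rather than once per insert. The trade-off is that the observer can lag the newest opTime by up to one second.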

            Assignee:
            Ernesto Rodriguez Reina
            Reporter:
            Ernesto Rodriguez Reina
            Votes:
            0
            Watchers:
            3

              Created:
              Updated:
              Resolved: