Add SingleDocumentLookupStats metrics infrastructure and recorder for change stream lookups


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Execution

      Context

      Change streams enrich update events with a post-image via a single-document lookup performed by one
      of three executors (Aggregation / Express / SBE). We want metrics per (consumer x engine) pair, a
      "cell", so operators can see lookup volume, outcomes, and latency, and so tests can assert which
      executor handled a lookup. Metrics are process-global singletons, whereas an executor is per-cursor
      and ephemeral, so the executor borrows a recorder rather than owning metrics.

      Design: SPM-4535 plan, "Observability" section.

      Scope

      • In single_document_lookup_stats.h / .cpp (new, under src/mongo/db/exec/single_doc_lookup/): add the struct SingleDocumentLookupStats, which holds references to one cell's four OTEL instruments (Counters found / notFound / notHandled, plus a latency Histogram in microseconds with buckets from ~50us to 100ms), and the concrete, non-virtual SingleDocumentLookupStatsRecorder, which holds a const reference to one cell and exposes recordFound(Microseconds), recordNotFound(Microseconds), and recordNotHandled().
      • In change_stream_metrics_util.h: add one create helper per cell, mirroring the existing createCursors* helpers, that builds the cell's instruments with serverStatusOptions.dottedPath = "changeStreams.<consumer>.<engine>.<stat>". Each helper's metric name is a one-line MetricNames entry in otel/metrics/metric_names.h, the standard two-part OTEL addition the cursor metrics already use.
      • Add the updateLookup express and aggregation cells (updateLookupExpress, updateLookupAggregation).
      • No executor records to these metrics yet, so there is no behaviour change.
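
      The shape described in the first Scope item can be sketched as follows. This is a minimal,
      standalone illustration only: Counter, Histogram, the bucket bounds, and exampleCell() are
      hypothetical stand-ins (the real OTEL instrument types and create helpers are not reproduced
      here), and std::chrono::microseconds stands in for the server's Microseconds type. It assumes
      C++17.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// Assumed upper bucket bounds in microseconds, spanning 50us to 100ms.
inline constexpr std::array<std::int64_t, 6> kLatencyBoundsMicros{
    50, 200, 1000, 5000, 20000, 100000};

// Hypothetical stand-in for an OTEL counter instrument.
class Counter {
public:
    void add(std::int64_t n) { _value.fetch_add(n, std::memory_order_relaxed); }
    std::int64_t value() const { return _value.load(std::memory_order_relaxed); }

private:
    std::atomic<std::int64_t> _value{0};
};

// Hypothetical stand-in for an OTEL histogram instrument.
class Histogram {
public:
    void record(std::int64_t micros) {
        assert(micros >= 0);
        // Find the first bucket whose upper bound is >= the sample; the
        // last bucket is the overflow bucket.
        std::size_t i = 0;
        while (i < kLatencyBoundsMicros.size() && micros > kLatencyBoundsMicros[i]) {
            ++i;
        }
        _buckets[i].fetch_add(1, std::memory_order_relaxed);
        _count.fetch_add(1, std::memory_order_relaxed);
    }
    std::int64_t count() const { return _count.load(std::memory_order_relaxed); }
    std::int64_t bucket(std::size_t i) const {
        return _buckets[i].load(std::memory_order_relaxed);
    }

private:
    std::array<std::atomic<std::int64_t>, kLatencyBoundsMicros.size() + 1> _buckets{};
    std::atomic<std::int64_t> _count{0};
};

// One cell: references to instruments that are owned elsewhere.
struct SingleDocumentLookupStats {
    Counter& found;
    Counter& notFound;
    Counter& notHandled;
    Histogram& latencyMicros;
};

// Concrete, non-virtual recorder that borrows a cell and never owns it.
class SingleDocumentLookupStatsRecorder {
public:
    explicit SingleDocumentLookupStatsRecorder(const SingleDocumentLookupStats& stats)
        : _stats(stats) {}

    void recordFound(std::chrono::microseconds latency) const {
        _stats.found.add(1);
        _stats.latencyMicros.record(latency.count());
    }
    void recordNotFound(std::chrono::microseconds latency) const {
        _stats.notFound.add(1);
        _stats.latencyMicros.record(latency.count());
    }
    // No latency sample: the executor did not perform the lookup.
    void recordNotHandled() const { _stats.notHandled.add(1); }

private:
    const SingleDocumentLookupStats& _stats;
};

// One possible way to obtain a process-lifetime cell (illustrative only).
inline const SingleDocumentLookupStats& exampleCell() {
    static Counter found, notFound, notHandled;
    static Histogram latency;
    static const SingleDocumentLookupStats cell{found, notFound, notHandled, latency};
    return cell;
}
```

      Because the recorder holds only a reference, a per-cursor executor can construct one cheaply
      and discard it while the cell's instruments outlive it.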

      Acceptance

      • serverStatus().metrics.changeStreams.<consumer>.<engine>.<stat> appears (zeroed) for every registered cell at startup, on a mongod.
      • The recorder is a concrete, non-virtual type; no interface or mock is introduced — metrics are verified by reading the real OTEL instruments via test utilities (OtelMetricsCapturer).
      • No classic MetricBuilder metric is added (OTEL only).
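
      For reference, the dotted-path shape named in the first acceptance item can be illustrated
      with a small helper. The function lookupStatDottedPath is hypothetical and is not part of
      change_stream_metrics_util.h; the consumer, engine, and stat strings a caller passes are
      likewise assumptions, since the ticket does not spell out the exact path segments.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: joins the segments into
// "changeStreams.<consumer>.<engine>.<stat>".
std::string lookupStatDottedPath(std::string_view consumer,
                                 std::string_view engine,
                                 std::string_view stat) {
    assert(!consumer.empty() && !engine.empty() && !stat.empty());
    std::string path = "changeStreams";
    for (std::string_view part : {consumer, engine, stat}) {
        path += '.';
        path += part;
    }
    return path;
}
```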

      Tests

      C++ unit test single_document_lookup_stats_test.cpp: construct a recorder over a real cell, call each
      record method, and read the instruments back with OtelMetricsCapturer (otel/metrics/metrics_test_util.h,
      the cursor_manager_test.cpp pattern). Assert the counters, and that latency is recorded on
      found/not-found but not on not-handled.
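
      The assertions described above could look roughly like the sketch below. It is framework-free
      and uses hypothetical stand-ins throughout: Counter and Histogram are plain tallies, and
      capture() merely plays the role that OtelMetricsCapturer has in the real test, whose API is not
      reproduced here.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-ins: plain tallies instead of real OTEL instruments.
struct Counter {
    std::int64_t value = 0;
    void add(std::int64_t n) { value += n; }
};
struct Histogram {
    std::int64_t samples = 0;
    void record(std::int64_t /*micros*/) { ++samples; }
};

struct SingleDocumentLookupStats {
    Counter& found;
    Counter& notFound;
    Counter& notHandled;
    Histogram& latencyMicros;
};

class SingleDocumentLookupStatsRecorder {
public:
    explicit SingleDocumentLookupStatsRecorder(const SingleDocumentLookupStats& stats)
        : _stats(stats) {}
    void recordFound(std::chrono::microseconds d) const {
        _stats.found.add(1);
        _stats.latencyMicros.record(d.count());
    }
    void recordNotFound(std::chrono::microseconds d) const {
        _stats.notFound.add(1);
        _stats.latencyMicros.record(d.count());
    }
    void recordNotHandled() const { _stats.notHandled.add(1); }

private:
    const SingleDocumentLookupStats& _stats;
};

// Value copy of a cell's readings at one point in time.
struct Snapshot {
    std::int64_t found, notFound, notHandled, latencySamples;
};

Snapshot capture(const SingleDocumentLookupStats& s) {
    return {s.found.value, s.notFound.value, s.notHandled.value, s.latencyMicros.samples};
}

// Calls each record method once and checks which instruments moved.
void testRecorderTouchesExpectedInstruments() {
    Counter found, notFound, notHandled;
    Histogram latency;
    const SingleDocumentLookupStats cell{found, notFound, notHandled, latency};
    const SingleDocumentLookupStatsRecorder recorder(cell);

    recorder.recordFound(std::chrono::microseconds(75));
    const Snapshot afterFound = capture(cell);
    assert(afterFound.found == 1 && afterFound.latencySamples == 1);

    recorder.recordNotFound(std::chrono::microseconds(30));
    const Snapshot afterNotFound = capture(cell);
    assert(afterNotFound.notFound == 1 && afterNotFound.latencySamples == 2);

    recorder.recordNotHandled();
    const Snapshot afterNotHandled = capture(cell);
    // The counter moves, but no latency sample is added.
    assert(afterNotHandled.notHandled == 1 && afterNotHandled.latencySamples == 2);
    // Earlier counters are untouched by later calls.
    assert(afterNotHandled.found == 1 && afterNotHandled.notFound == 1);
}
```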

            Assignee:
            Denis Grebennicov
            Reporter:
            Denis Grebennicov
            Votes:
            0
            Watchers:
            1