Monitor publishes ServerDescriptionChanged events while PushMonitor is active, contrary to SDAM spec


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Unknown
    • Affects Version/s: None
    • Component/s: None
    • Ruby Drivers

      Problem

      When the streaming protocol is active (MongoDB 4.4+ with serverMonitoringMode=stream, or with serverMonitoringMode=auto when not running on a FaaS platform), the Ruby driver runs both Mongo::Server::Monitor (polling) and Mongo::Server::PushMonitor (streaming) in parallel. Both call run_sdam_flow and publish ServerDescriptionChanged events for the same logical server state: Monitor with awaited: false and PushMonitor with awaited: true.

      This violates the SDAM "Server Monitoring" specification, which states:

      > "When using the streaming protocol, clients MUST issue a hello or legacy hello command to each server to measure RTT every heartbeatFrequencyMS. The RTT command MUST be run on a dedicated connection to each server."
      > "Clients MUST ignore the response to the hello or legacy hello command when measuring RTT. Errors encountered when running a hello or legacy hello command MUST NOT update the topology."
      > "Clients MUST NOT publish any events when running an RTT command."

      In streaming mode, the regular Monitor's role is RTT measurement, and it must not publish SDAM events. Today it does, leading to duplicate event sequences for steady-state heartbeats and to the alternating non-awaited / awaited log lines seen in RUBY-3456.

      Suggested approach

      Stop calling server.cluster.run_sdam_flow(...) from Monitor#run_sdam_flow (lib/mongo/server/monitor.rb:234) when the server has an active PushMonitor. The Monitor's responsibility in that mode reduces to (a) measuring RTT via the RTT calculator and (b) keeping the connection warm.
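
      The suggested guard can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: SketchCluster, SketchMonitor, the skip_sdam_when_streaming flag, and simulate_heartbeat are hypothetical stand-ins invented for illustration, not the driver's actual classes or API. It shows one heartbeat round with the guard off (current behavior) and on (proposed behavior).

```ruby
# Hypothetical stand-ins for illustration only; none of these are the
# driver's real classes or method signatures.
Event = Struct.new(:address, :awaited)

class SketchCluster
  attr_reader :events

  def initialize
    @events = []
  end

  # Stand-in for the cluster-level SDAM flow, which publishes an event.
  def run_sdam_flow(address, awaited:)
    @events << Event.new(address, awaited)
  end
end

class SketchMonitor
  attr_reader :rtt_samples
  attr_writer :push_monitor_active

  def initialize(cluster, address, skip_sdam_when_streaming:)
    @cluster = cluster
    @address = address
    @skip_sdam_when_streaming = skip_sdam_when_streaming
    @push_monitor_active = false
    @rtt_samples = []
  end

  # One polling check: always record RTT, but only run the SDAM flow
  # when no push monitor is active (or when the guard is disabled).
  def check(rtt_seconds)
    @rtt_samples << rtt_seconds
    return if @skip_sdam_when_streaming && @push_monitor_active

    @cluster.run_sdam_flow(@address, awaited: false)
  end
end

# Simulates one steady-state heartbeat round in streaming mode and returns
# the awaited flags of the published events plus the recorded RTT samples.
def simulate_heartbeat(skip_sdam_when_streaming:)
  address = 'localhost:27017'
  cluster = SketchCluster.new
  monitor = SketchMonitor.new(
    cluster, address, skip_sdam_when_streaming: skip_sdam_when_streaming
  )
  monitor.push_monitor_active = true

  cluster.run_sdam_flow(address, awaited: true) # streaming (push) response
  monitor.check(0.002)                          # polling check for RTT

  [cluster.events.map(&:awaited), monitor.rtt_samples]
end

current  = simulate_heartbeat(skip_sdam_when_streaming: false)
proposed = simulate_heartbeat(skip_sdam_when_streaming: true)
```

      In this model the unguarded run publishes two events for one heartbeat (awaited and non-awaited), while the guarded run publishes only the awaited one and still records the RTT sample. How the real Monitor would detect an active PushMonitor is left open here.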

      Notes

      • Found while investigating RUBY-3456. Out of scope for that ticket (RUBY-3456 fixes only the noisy log lines at the subscriber layer).
      • Affects every consumer of ServerDescriptionChanged and TopologyChanged events, including the SDAM unified test runner, so a careful migration path is required.

            Assignee: Unassigned
            Reporter: Dmitry Rybakov
            Votes: 0
            Watchers: 1

              Created:
              Updated: