Deadlock in topology worker shutdown when SDAM monitor is streaming


    • Type: Bug
    • Resolution: Fixed
    • Priority: Unknown
    • Fix Version/s: 3.6.0
    • Affects Version/s: None
    • Component/s: None
    • Rust Drivers
    • Not Needed

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?


      Summary

      Client::shutdown() can deadlock when an SDAM monitor is running in streaming mode. The topology worker's shutdown sequence waits for the monitor to exit, while the monitor is blocked waiting for the topology worker to acknowledge an update.

      Reproduction

      The bug requires the monitor to complete an awaitable hello and call topology_updater.update() in the narrow window between the topology worker exiting its processing loop and calling close_monitor(). This is difficult to trigger in pure Rust, but it has been reproduced when the driver is used via FFI with a shared tokio runtime (e.g., a Java driver wrapping the Rust driver via Panama), and in pure Rust by injecting a std::thread::sleep to widen the window.

      Root Cause

      When Client::shutdown() is called:

      1. The TopologyWorker receives Broadcast(Shutdown), broadcasts the shutdown to its connection pools, and breaks out of its event-processing loop.

      2. It enters the cleanup sequence: it drops the publisher, then calls close_monitor().await on each server's MonitorManager.

      3. close_monitor() drops the WorkerHandle, sends CancellationReason::ServerClosed, and awaits cancellation_sender.closed(), i.e., waits for the monitor task to fully exit.

      4. Meanwhile, the monitor (running in streaming mode) may have just received a response to its awaitable hello. It processes the reply and calls self.topology_updater.update(server_description).await, which sends an UpdateMessage on the unbounded channel and awaits acknowledgment via a oneshot.

      5. The topology worker's update_receiver is still alive (it is a field on the worker struct), so the send succeeds and the message is buffered. But the worker has already exited its loop and will never recv from the channel again, so the oneshot acknowledgment never arrives.
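      The channel semantics in play here can be seen in isolation. This is a toy sketch using tokio's unbounded channel, not driver code: a send succeeds and buffers as long as the receiver is alive, even if the receiver is never polled again.

      ```rust
      use tokio::sync::mpsc;

      fn main() {
          let (tx, rx) = mpsc::unbounded_channel::<&str>();
          // Receiver alive but never polled: the send still succeeds and the
          // message sits in the buffer indefinitely.
          assert!(tx.send("update").is_ok());
          // Only once the receiver is dropped do sends start failing.
          drop(rx);
          assert!(tx.send("update").is_err());
          println!("ok");
      }
      ```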

      Circular wait:

      • The topology worker is in close_monitor().await, waiting for the monitor's cancellation receiver to be dropped (i.e., the monitor task to exit).
      • The monitor task is in topology_updater.update().await, waiting for the topology worker to acknowledge its message.

      Neither can make progress.
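      The circular wait can be modeled in miniature with plain tokio primitives. This is an illustrative sketch, not the driver's actual code (names like monitor and update_tx are made up for the example); a timeout stands in for the real hang.

      ```rust
      // Minimal model of the circular wait: a "worker" holds the update
      // receiver but has stopped polling it, then waits for the "monitor"
      // task to exit; the monitor awaits a oneshot ack that is never sent.
      use tokio::sync::{mpsc, oneshot};
      use tokio::time::{timeout, Duration};

      #[tokio::main(flavor = "current_thread")]
      async fn main() {
          let (update_tx, update_rx) = mpsc::unbounded_channel::<oneshot::Sender<()>>();

          // Monitor: sends its update, then blocks awaiting acknowledgment.
          let monitor = tokio::spawn(async move {
              let (ack_tx, ack_rx) = oneshot::channel();
              update_tx.send(ack_tx).unwrap(); // receiver alive: send succeeds and buffers
              ack_rx.await // never acknowledged -- the worker stopped polling
          });

          // Worker: still owns the receiver (a field on its struct) but has
          // exited its processing loop; it now waits for the monitor to finish.
          let _update_receiver = update_rx;
          let result = timeout(Duration::from_millis(100), monitor).await;
          assert!(result.is_err(), "neither task makes progress");
          println!("deadlock reproduced in miniature");
      }
      ```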

      Fix

      Drop self.update_receiver after breaking out of the processing loop, before calling close_monitor(). This causes the UnboundedSender::send() in the monitor's update() call to return Err(receiver gone), so send_message() returns false immediately. The monitor proceeds, checks is_alive() (which returns false because close_monitor dropped the handle), and exits.

      Dropping (rather than just closing) the receiver also drops any already-buffered messages, which drops their oneshot senders, unblocking any monitors that already sent an update and are waiting for acknowledgment.
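      Both effects of dropping the receiver can be checked with a small tokio sketch (illustrative names, not the driver's code): a monitor already blocked on its oneshot is released, and any later send fails fast.

      ```rust
      use tokio::sync::{mpsc, oneshot};

      #[tokio::main(flavor = "current_thread")]
      async fn main() {
          let (update_tx, update_rx) = mpsc::unbounded_channel::<oneshot::Sender<()>>();

          // A monitor that already sent its update and awaits acknowledgment.
          let (ack_tx, ack_rx) = oneshot::channel();
          update_tx.send(ack_tx).unwrap(); // buffered; the worker never recvs it
          let blocked_monitor = tokio::spawn(async move { ack_rx.await });

          // The fix: drop the receiver before close_monitor().
          drop(update_rx);

          // Any subsequent send fails immediately instead of buffering...
          let (tx2, _rx2) = oneshot::channel::<()>();
          assert!(update_tx.send(tx2).is_err());

          // ...and the buffered message was dropped with the receiver, which
          // dropped its oneshot sender, so the blocked monitor wakes with an
          // error instead of waiting forever.
          assert!(blocked_monitor.await.unwrap().is_err());
          println!("monitor unblocked; shutdown can proceed");
      }
      ```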

      Behavioral Impact

      Minimal. During the brief shutdown window, a monitor's final ServerDescriptionChanged SDAM event may be dropped (the topology description won't be updated from that last heartbeat). All shutdown lifecycle events (ServerClosed, TopologyDescriptionChanged, TopologyClosed) are unaffected — they are emitted directly in the shutdown sequence, not through the update channel.

            Assignee:
            Abraham Egnor
            Reporter:
            Jeffrey Yemin